UNIQUE ASSOCIATED KAPOSI'S SARCOMA VIRUS SEQUENCES AND
USES THEREOF

The invention disclosed herein was made with
Government support under a co-operative agreement
CCU2106 52 from the Centers for Disease Control and
Prevention, and under National Institutes of Health,
National Cancer Institute award CAS73 51 of the
Department of Health and Human Services. Accordingly,
the U.S. Government has certain rights in this
invention.



Throughout this application, various publications may
be referenced by Arabic numerals in brackets. Full
citations for these publications may be found at the
end of the Detailed Description of the Invention. The
disclosures of all publications cited herein are in
their entirety hereby incorporated by reference into
this application to more fully describe the state of
the art to which this invention pertains.
BACKGROUND OF THE INVENTION

Kaposi's sarcoma-associated herpesvirus (KSHV) is a
new human herpesvirus (HHV8) believed to cause
Kaposi's sarcoma (KS) [1,2].

Kaposi's sarcoma is the most common neoplasm occurring
in persons with acquired immunodeficiency syndrome
(AIDS). Approximately 15-20% of AIDS patients develop
this neoplasm, which rarely occurs in immunocompetent
individuals. Epidemiologic evidence suggests that
AIDS-associated KS (AIDS-KS) has an infectious
etiology. Gay and bisexual AIDS patients are
approximately twenty times more likely than
hemophiliac AIDS patients to develop KS, and KS may be
associated with specific sexual practices among gay
men with AIDS. KS is uncommon among adult AIDS
patients infected through heterosexual or parenteral
HIV transmission, or among pediatric AIDS patients
infected through vertical HIV transmission. Agents
previously suspected of causing KS include
cytomegalovirus, hepatitis B virus, human
papillomavirus, Epstein-Barr virus (EBV), human
herpesvirus 6, human immunodeficiency virus (HIV), and
Mycoplasma penetrans. Non-infectious environmental
agents, such as nitrite inhalants, also have been
proposed to play a role in KS tumorigenesis.
Extensive investigations, however, have not
demonstrated an etiologic association between any of
these agents and AIDS-KS.
This invention provides an isolated nucleic acid
molecule which encodes Kaposi's Sarcoma-Associated
Herpesvirus (KSHV) polypeptides. This invention
provides an isolated polypeptide molecule of KSHV.
This invention provides an antibody specific to the
polypeptide. Antisense and triplex oligonucleotide
molecules are also provided. This invention provides
a vaccine for Kaposi's Sarcoma (KS). This invention
provides methods of vaccination, prophylaxis,
diagnosis and treatment of a subject with KS and of
detecting expression of a DNA virus associated with
Kaposi's sarcoma in a cell.
Figure 1:

Annotated long unique region (LUR) and terminal
repeat (TR) of the KSHV genome. The orientation
of identified ORFs in the LUR is denoted by the
direction of arrows, with ORFs similar to HVS in
dark blue and dissimilar ORFs in light blue.
Seven blocks (numbered) of conserved herpesvirus
genes with nonconserved interblock regions
(lettered) are shown under the kilobase marker;
the block numbering scheme differs from the
original description by Chee (Chee et al., 1990,
Curr. Topics Microbiol. Immunol. 154, 125-169).
The overlapping cosmid (Z prefix) and lambda (L
prefix) clones used to map the KSHV genome are
compared to the KS5 lambda phage clone from a KS
lesion and shown below. Features and putative
coding regions not specifically designated are
shown above the ORF map. Repeat regions are
shown as white lines (frnk, vnct, waka/jwka,
zppa, moi, mdsk). Putative coding regions and
other features (see Experimental Details Section)
not designated as ORFs are shown as solid
lines.

Figure 2A-2D: 

(Fig. 2A) Sequence of terminal repeat unit (TR)
demonstrating its high G+C content (SEQ ID
NO:16). Sequences highly similar to conserved
herpesvirus pac1 sites are underlined with less
similar sites to specific pac1 and pac2 sequences
italicized. (Fig. 2B) Southern blot of DNA from
BC-1 (lane 1), BCP-1 (lane 2) and a KS lesion
(lane 3) digested with Nde II, which cuts once in
the TR sequence, and probed with a plasmid [...].
The intense hybridization band at 0.8 kb
represents multiple copies of the Nde II-digested
single unit TR. (Fig. 2C) A schematic
representation of genome structures of KSHV in
BCP-1 and BC-1 cell lines consistent with the
data presented in (Fig. 2B) and (Fig. 2D). Taq I
(T) sites flank the TR regions and Nde II (N)
sites are within the TRs. Lower case tr refers to
the deleted truncated TR unit at the left end of
the unique region. DR represents the duplicated
region of the LUR buried within the TR. (Fig. 2D)
Southern blot hybridization with TR probe of DNA
from BC-1 (lane 1), BCP-1 (lane 2), a KS lesion
(lane 3), and HBL-6 (lane 4) digested with Taq I,
which does not cut in the TR. Taq I-digested DNA
from both BC-1 (lane 1) and HBL-6 (lane 4) show
similar TR hybridization patterns suggesting
identical insertion of a unique sequence into the
TR region, which sequencing studies demonstrate
is a duplicated portion of the LUR (see
Experimental Details Section). BCP-1 TR
hybridization (lane 2) shows laddering consistent
with a virus population having variable TR region
lengths within this cell line due to lytic
replication. The absence of TR laddering in KS
lesion DNA (lane 3) suggests that a clonal virus
population is present in the tumor.

Figures 3A-3C: 

CLUSTAL W alignments of KSHV-encoded polypeptide
sequences to corresponding human cell signaling
pathway polypeptide sequences. Fig. 3A. Two KSHV
MIP-like polypeptides (vMIP-I and vMIP-II) are
compared to human MIP-1α, MIP-1β and RANTES
(amino acid identity to vMIP-I indicated by black
reverse shading, to vMIP-II alone by gray reverse
shading, and the C-C dimer motif is italicized).
Both KSHV MIP genes encode 19 residue N-terminus
hydrophobic secretory leader sequences which are
relatively poorly conserved (vMIP-I also has a
second C-C dimer in the hydrophobic leader
sequence without similarity to the chemokine
dicysteine motif). Potential O-linked
glycosylation sites for vMIP-I (gapped positions
22 and 27) are not present in vMIP-II, which has
only one predicted potential serine glycosylation
site (position 51) not found in vMIP-I. Fig. 3B.
Alignment of the KSHV vIL-6 to human IL-6. Fig.
3C-1 and 3C-2. Alignment of the KSHV vIRF
polypeptide to human ICSBP and ISGF3 with the
putative ICS-binding tryptophans (W) for ICSBP and
ISGF3 in italics.



Figures 4A-4F: 

Northern hybridization of total RNA extracted
from BCP-1 and BC-1 cells with or without 48 hour
incubation with TPA and control P3HR1 cells after
TPA incubation. All four genes (Fig. 4A, vMIP-I;
Fig. 4B, vMIP-II; Fig. 4C, vIL-6; Fig. 4D, vIRF)
are TPA inducible, but constitutive, noninduced
expression of vIL-6 (Fig. 4C) and vIRF (Fig. 4D)
is also evident for BCP-1 and BC-1 and of vMIP-I
for BCP-1 (Fig. 4A). Representative
hybridizations to a human β-actin probe (Figs.
4E-4F) demonstrate comparable loading of RNA for
cell preparations.



Figures 5A-5B: 

Fig. 5A. Immunoblot of rabbit antipeptide
antibodies generated from amino acid sequences of
vIL-6, THYSPPKFDR (SEQ ID NO:2) and PDVTPDVHDR
(SEQ ID NO:3), against cell lysates of BCP-1,
BC-1, P3HR1 cell lines with and without TPA
induction (lanes 1-6), 1 µg human rIL-6 (lane 7),
and concentrated COS7 rvIL-6 and r6-LIv
supernatants (lanes 8-9). Anti-vIL-6 antibodies
specifically recognize the viral IL-6 polypeptide
in both recombinant supernatants and cell lysates
but not human IL-6. The BCP-1 cell line
constitutively expresses low levels of vIL-6
whereas polypeptide expression increases on TPA
treatment for both BC-1 (KSHV and EBV coinfected)
and BCP-1 (KSHV infection alone), indicating lytic
phase expression. Preimmune sera from immunized
rabbits did not react on immunoblotting to any of
the preparations. Fig. 5B. Anti-huIL-6
monoclonal antibodies do not cross-react with
cell-associated or recombinant vIL-6
preparations.

Figure 6:

Dose-response curves for 3H-thymidine uptake in
IL-6-dependent B9 mouse plasmacytoma cells with
serial dilutions of rhuIL-6 (filled squares) and
COS7 supernatants of rvIL-6 (filled circles),
r6-LIv (open squares) or control LacZ (open
circles) pMET7 transfections. Undiluted rvIL-6
supernatants from this transfection lot show
similar B9 proliferation activity to huIL-6 >0.02
ng/ml whereas the reverse construct (r6-LIv) and
the LacZ control show no increased ability to
induce B9 proliferation. Concentrated
supernatants at greater than 1:2 dilution may
have increased activity due to concentration of
COS7 conditioning factors.

Figures 7A-7F:

Rabbit anti-vIL-6 peptide antibody reactivity
localized using goat-antirabbit immunoglobulin-
peroxidase conjugate (brown) with hematoxylin
counterstaining (blue) at X100 magnification
demonstrates vIL-6 production in
KSHV-infected cell lines and tissues. The
KSHV-infected cell line BCP-1 (Fig. 7A), but not
the control EBV-infected cell line P3HR1 (Fig.
7B), shows prominent cytoplasmic vIL-6
localization. (Fig. 7C) Cytoplasmic localization
of vIL-6 in spindle-shaped cells from an AIDS-KS
lesion. Of eight KS lesions, only one had
readily identifiable vIL-6 staining of a
subpopulation of cells. In contrast, the
majority of pelleted lymphoma cells from a
nonAIDS, EBV-negative PEL have intense vIL-6
staining (Fig. 7E). No immunostaining is present
in control angiosarcoma (Fig. 7D) or multiple
myeloma tissues (Fig. 7F).



Figures 8A-8D:

Double antibody labeling of anti-vIL-6 and cell
surface antigens. Examples of both CD34 and CD20
colocalization with vIL-6 were found in a KS
lesion. Fig. 8A. CD34 (red) and vIL-6 (blue)
colocalize in a KS spindle cell (arrow). Purple
coloration is due to overlapping chromogen
staining (100X). Fig. 8B. CD45 common leukocyte
antigen staining (blue, arrow) on vIL-6 (red)
expressing Kaposi's sarcoma cells (100X). Fig.
8C. Low power magnification (20X) demonstrating
numerous vIL-6 producing hematopoietic cells
(red) in a lymph node from a patient with KS.
Arrows only indicate the most prominently
staining cells; nuclei counterstained with
hematoxylin. Fig. 8D. Colocalization of CD20
(brown, arrows) with vIL-6 (red) in an AIDS-KS
patient's lymph node (100X).
Figure 9:

Quantification of CCC/CD4 cell infection by
primary NSI SF162 and M23 HIV-1 strains and HIV-2
strain ROD/B in the presence or absence of
vMIP-I. CCC/CD4 cells were transiently
cotransfected with CCR5 alone, CCR5 plus empty
pMET7 vector, CCR5 plus vMIP-I in pMET7 vector,
or CCR5 plus the reverse orientation I-PIMv. The
results after 72 hours of incubation with each
retrovirus are expressed as a percentage of the
foci forming units for cells transfected with
CCR5 alone. The forward vMIP-I construct
inhibited NSI HIV-1 replication but not HIV-2
replication, while the reverse I-PIMv construct
had no effect on replication of any of the
retroviruses.
DETAILED DESCRIPTION OF THE INVENTION

Definitions

The following standard abbreviations are used
throughout the specification to indicate specific
nucleotides:

C=cytosine A=adenosine
T=thymidine G=guanosine

The term "nucleic acid", as used herein, refers to
either DNA or RNA, including complementary DNA (cDNA),
genomic DNA and messenger RNA (mRNA). As used herein,
"genomic" means both coding and non-coding regions of
the isolated nucleic acid molecule. "Nucleic acid
sequence" refers to a single- or double-stranded
polymer of deoxyribonucleotide or ribonucleotide bases
read from the 5' to the 3' end. It includes both
self-replicating plasmids, infectious polymers of DNA
or RNA and nonfunctional DNA or RNA.

The term "polypeptide", as used herein, refers to
either the full length gene product encoded by the
nucleic acid, or portions thereof. Thus,
"polypeptide" includes not only the full-length
protein, but also partial-length fragments, including
peptides less than fifty amino acid residues in
length.

The term "SSC" refers to a citrate-saline solution of
0.15 M sodium chloride and 20 mM sodium citrate.
Solutions are often expressed as multiples or
fractions of this concentration. For example, 6XSSC
refers to a solution having a sodium chloride and
sodium citrate concentration of 6 times this amount or
0.9 M sodium chloride and 120 mM sodium citrate.
0.2XSSC refers to a solution 0.2 times the SSC
concentration or 0.03 M sodium chloride and 4 mM
sodium citrate.
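
The scaling rule for SSC multiples can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the definition above; the function and constant names are hypothetical, and the 1X values are the ones stated in this specification.

```python
# 1X SSC concentrations as defined in this specification.
SSC_1X_NACL_M = 0.15       # sodium chloride, mol/L
SSC_1X_CITRATE_MM = 20.0   # sodium citrate, mmol/L


def ssc_concentrations(multiple):
    """Return (NaCl in M, sodium citrate in mM) for a multiple of SSC.

    `multiple` is the factor in front of "XSSC", e.g. 6 for 6XSSC
    or 0.2 for 0.2XSSC.
    """
    if multiple <= 0:
        raise ValueError("SSC multiple must be positive")
    # Round to suppress binary floating-point noise such as 0.8999999.
    nacl_m = round(SSC_1X_NACL_M * multiple, 6)
    citrate_mm = round(SSC_1X_CITRATE_MM * multiple, 6)
    return nacl_m, citrate_mm


if __name__ == "__main__":
    for multiple in (6, 1, 0.2):
        nacl, citrate = ssc_concentrations(multiple)
        print(f"{multiple}XSSC: {nacl} M NaCl, {citrate} mM sodium citrate")
```

Calling `ssc_concentrations(6)` and `ssc_concentrations(0.2)` reproduces the two worked values given in the definition.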

The phrase "selectively hybridizing to" and the phrase
"specific hybridization" describe a nucleic acid probe
that hybridizes, duplexes or binds only to a
particular target DNA or RNA sequence when the target
sequences are present in a preparation of total
cellular DNA or RNA. By selectively hybridizing it is
meant that a probe binds to a given target in a manner
that is detectable in a different manner from
non-target sequence under high stringency conditions
of hybridization.

"Complementary" or "target" nucleic acid sequences
refer to those nucleic acid sequences which
selectively hybridize to a nucleic acid probe. Proper
annealing conditions depend, for example, upon a
probe's length, base composition, and the number of
mismatches and their position on the probe, and must
often be determined empirically. For discussions of
nucleic acid probe design and annealing conditions,
see, for example, Sambrook et al. (1989) Molecular
Cloning: A Laboratory Manual (2nd ed.), Cold Spring
Harbor Laboratory, Vols. 1-3 or Ausubel, et al.
(1987) Current Protocols in Molecular Biology, New
York.

The phrase "nucleic acid molecule encoding" refers to
a nucleic acid molecule which directs the expression
of a specific polypeptide. The nucleic acid sequences
include both the DNA strand sequence that is
transcribed into RNA, the complementary DNA strand,
and the RNA sequence that is translated into protein.
The nucleic acid molecule includes both the full
length nucleic acid sequence as well as non-full
length sequences. It is further understood that
the sequence includes the degenerate codons of the
native sequence or sequences which may be introduced
to provide codon preference in a specific host cell.

A nucleic acid probe is "specific" for a target
organism of interest if it includes a nucleotide
sequence which when detected is determinative of the
presence of the organism in the presence of a
heterogeneous population of proteins and other
biologics. A specific nucleic acid probe is targeted
to that portion of the sequence which is determinative
of the organism and will not hybridize to other
sequences, especially those of the host where a
pathogen is being detected.

The phrase "expression cassette" refers to nucleotide
sequences which are capable of affecting expression of
a structural gene in hosts compatible with such
sequences. Such cassettes include at least promoters
and, optionally, transcription termination signals.
Additional factors necessary or helpful in effecting
expression may also be used as described herein.

The term "operably linked" as used herein refers to
linkage of a promoter upstream from a DNA sequence
such that the promoter mediates transcription of the
DNA sequence.



The term "vector" refers to viral expression systems,
autonomous self-replicating circular DNA (plasmids),
and includes both expression and nonexpression
plasmids. Where a recombinant microorganism or cell
culture is described as hosting an "expression
vector," this includes both extrachromosomal circular
DNA and DNA that has been incorporated into the host
chromosome(s). Where a vector is being maintained by
a host cell, the vector may either be stably
replicated by the cells during mitosis as an
autonomous structure, or is incorporated within the
host's genome.

The term "plasmid" refers to an autonomous circular
DNA molecule capable of replication in a cell, and
includes both the expression and nonexpression types.
Where a recombinant microorganism or cell culture is
described as hosting an "expression plasmid", this
includes latent viral DNA integrated into the host
chromosome(s). Where a plasmid is being maintained by
a host cell, the plasmid is either being stably
replicated by the cells during mitosis as an
autonomous structure or is incorporated within the
host's genome.

The phrase "recombinant protein" or "recombinantly
produced protein" refers to a polypeptide produced
using non-native cells. The cells produce the protein
because they have been genetically altered by the
introduction of the appropriate nucleic acid sequence.

The following terms are used to describe the sequence
relationships between two or more nucleic acid
molecules: "reference sequence", "comparison window",
"sequence identity", "percentage of sequence
identity", and "substantial identity". A "reference
sequence" is a defined sequence used as a basis for a
sequence comparison; a reference sequence may be a
subset of a larger sequence, for example, as a segment
of a full-length cDNA or gene sequence given in a
sequence listing, or may comprise a complete cDNA or
gene sequence.



Optimal alignment of sequences in a comparison window
may be conducted by the algorithm of Smith and
Waterman (1981) Adv. Appl. Math. 2:482, by the
algorithm of Needleman and Wunsch (1970) J. Mol. Biol.
48:443, by the search-for-similarity method of Pearson
and Lipman (1988) Proc. Natl. Acad. Sci. 85:2444, or
by computerized implementations of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in GCG, the Wisconsin
Genetics Software Package Release 8.0, Genetics
Computer Group, 575 Science Dr., Madison, WI).
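
For readers unfamiliar with the cited algorithms, the global alignment method of Needleman and Wunsch can be sketched as below. This is a minimal illustration, not the GCG GAP implementation: the unit match, mismatch and linear gap scores are assumptions chosen for readability, whereas GAP and BESTFIT use scoring matrices and gap weights of their own.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences by dynamic programming.

    Returns (score, aligned_a, aligned_b). The scoring values are
    illustrative assumptions, not the defaults of any GCG program.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First row and column: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Fill each cell with the best of diagonal, up and left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one alignment.
    aligned_a, aligned_b = [], []
    i, j = len(seq_a), len(seq_b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                aligned_a.append(seq_a[i - 1])
                aligned_b.append(seq_b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(seq_a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(seq_b[j - 1])
            j -= 1

    return (
        score[-1][-1],
        "".join(reversed(aligned_a)),
        "".join(reversed(aligned_b)),
    )


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

The Smith-Waterman local alignment differs mainly in clamping each cell at zero and tracing back from the highest-scoring cell rather than the corner.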



As applied to polypeptides, the terms "substantial
identity" or "substantial sequence identity" mean that
two peptide sequences, when optimally aligned, such as
by the programs GAP or BESTFIT using default gap
weights, share at least 90 percent sequence identity,
preferably at least 95 percent sequence identity, more
preferably at least 99 percent sequence identity or
more.

"Percentage amino acid identity" or "percentage amino
acid sequence identity" refers to a comparison of the
amino acids of two polypeptides which, when optimally
aligned, have approximately the designated percentage
of the same amino acids. For example, "95% amino acid
identity" refers to a comparison of the amino acids of
two polypeptides which when optimally aligned have 95%
amino acid identity. Preferably, residue positions
which are not identical differ by conservative amino
acid substitutions. For example, the substitution of
amino acids having similar chemical properties, such
as charge or polarity, is not likely to affect the
properties of a protein. Examples include glutamine
for asparagine or glutamic acid for aspartic acid.
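
The percentage identity calculation can be illustrated with a short Python sketch operating on two already-aligned sequences. The function names are hypothetical, and the choice to exclude gapped columns from the denominator is an assumption; alignment programs differ on this point. Only the two substitution pairs named above are treated as conservative here.

```python
# Conservative pairs named in the text, in one-letter code:
# glutamine/asparagine (Q/N) and glutamic acid/aspartic acid (E/D).
CONSERVATIVE_PAIRS = {frozenset("QN"), frozenset("ED")}


def percent_identity(aligned_a, aligned_b):
    """Percentage of identical residues over ungapped aligned columns.

    Assumption: columns where either sequence has a gap ('-') are
    excluded from both numerator and denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def nonidentical_positions(aligned_a, aligned_b):
    """List (index, residue_a, residue_b, is_conservative) for mismatches."""
    differences = []
    for index, (a, b) in enumerate(zip(aligned_a, aligned_b)):
        if a == "-" or b == "-" or a == b:
            continue
        differences.append((index, a, b, frozenset((a, b)) in CONSERVATIVE_PAIRS))
    return differences


if __name__ == "__main__":
    # Arbitrary 20-residue example with one E -> D substitution.
    reference = "ACDEFGHIKLMNPQRSTVWY"
    variant = "ACDDFGHIKLMNPQRSTVWY"
    print(percent_identity(reference, variant))
    print(nonidentical_positions(reference, variant))
```

In the example, 19 of 20 residues match, so the pair meets the "95% amino acid identity" example in the text, and the single difference is one of the conservative substitutions it names.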

The phrase "substantially purified" or "isolated" when
referring to a herpesvirus polypeptide, means a
chemical composition which is essentially free of
other cellular components. It is preferably in a
homogeneous state although it can be in either a dry
or aqueous solution. Purity and homogeneity are
typically determined using analytical chemistry
techniques such as polyacrylamide gel electrophoresis
or high performance liquid chromatography. A protein
which is the predominant species present in a
preparation is substantially purified. Generally, a
substantially purified or isolated protein will
comprise more than 80% of all macromolecular species
present in the preparation. Preferably, the protein
is purified to represent greater than 90% of all
macromolecular species present. More preferably the
protein is purified to greater than 95%, and most
preferably the protein is purified to essential
homogeneity, wherein other macromolecular species are
not detected by conventional techniques.

The phrase "specifically binds to an antibody" or
"specifically immunoreactive with", when referring to
a polypeptide, refers to a binding reaction which is
determinative of the presence of the KSHV polypeptide
of the invention in the presence of a heterogeneous
population of polypeptides and other biologics
including viruses other than KSHV. Thus, under
designated immunoassay conditions, the specified
antibodies bind to the KSHV antigen and do not bind in
a significant amount to other antigens present in the
sample.

"Specific binding" to an antibody under such
conditions may require an antibody that is selected
for its specificity for a particular antigen. For
example, antibodies raised to KSHV antigens described
herein can be selected to obtain antibodies
specifically immunoreactive with KSHV polypeptides and
not with other polypeptides.
"Biological sample" as used herein refers to any
sample obtained from a living organism or from an
organism that has died. Examples of biological
samples include body fluids and tissue specimens.

It will be readily understood by those skilled in the
art, and it is intended here, that when reference is
made to particular sequence listings, such reference
includes sequences which substantially correspond to
the listing and its complement, including allowances
for minor sequencing errors, single base changes,
deletions, substitutions and the like, such that any
such sequence variation corresponds to the nucleic
acid sequence of the pathogenic organism or disease
marker to which the relevant sequence listing relates.

I. Nucleic Acid Molecule from KSHV

This invention provides an isolated nucleic acid
molecule which encodes a Kaposi's sarcoma-associated
herpesvirus (KSHV) polypeptide.

In one embodiment, the isolated nucleic acid molecule
which encodes a KSHV polypeptide has the nucleotide
sequence as set forth in GenBank Accession Number
U75698 and the start and stop codons set forth in
Table 1. In another embodiment, the isolated nucleic
acid molecule which encodes a KSHV polypeptide has the
amino acid sequence defined by the translation of the
nucleotide sequence set forth in GenBank Accession
Number U75698 and the start and stop codons set forth
in Table 1.

In one embodiment, the isolated nucleic acid molecule
for a KSHV polypeptide has the 5' untranslated
sequence as set forth in GenBank Accession Number
U75698 upstream of the ATG start codon. In another
embodiment, the isolated nucleic acid molecule for a
KSHV polypeptide has the 3' untranslated sequence as
set forth in GenBank Accession Number
downstream of the stop codon.

In one embodiment the isolated nucleic acid molecule
is genomic DNA. In another embodiment the isolated
nucleic acid molecule is cDNA. In another embodiment
RNA is derived from the isolated nucleic acid molecule
or is capable of hybridizing with the isolated nucleic
acid molecule.

Further, the nucleic acid molecule above may be
associated with lymphoproliferative diseases
including, but not limited to: Hodgkin's disease,
non-Hodgkin's lymphoma, lymphatic leukemia,
lymphosarcoma, splenomegaly, reticular cell sarcoma,
Sezary's syndrome, mycosis fungoides, central nervous
system lymphoma, AIDS related central nervous system
lymphoma, post-transplant lymphoproliferative
disorders, and Burkitt's lymphoma. A
lymphoproliferative disorder is characterized as being
the uncontrolled clonal or polyclonal expansion of
lymphocytes involving lymph nodes, lymphoid tissue and
other organs.

A. Isolation and Propagation of KSHV

KSHV can be propagated in vitro. For example,
techniques for growing herpesviruses have been
described by Ablashi et al. in Virology 164:545-552.
Briefly, PHA stimulated cord blood mononuclear cells,
macrophage, neuronal, or glial cell lines are
cocultivated with cerebrospinal fluid, plasma,
peripheral blood leukocytes, or tissue extracts
containing viral infected cells or purified virus.
The recipient cells are treated with 5 µg/ml polybrene
for 2 hours at 37°C prior to infection. Infected
cells are observed by demonstrating morphological
changes, as well as being viral antigen positive.

For KSHV isolation, the virus is either harvested
directly from cell culture fluid by centrifugation, or
the infected cells are harvested, homogenized or lysed
and the virus is separated from cellular debris and
purified by standard methods of isopycnic sucrose
density gradient centrifugation.

One skilled in the art may isolate and propagate KSHV
employing the following protocol. Long-term
establishment of a B lymphoid cell line infected with
KSHV (e.g., RCC-1, HBL-6 or BCBL-1) is accomplished
using body-cavity based lymphomas and standard
techniques (Glick, 1980, Fundamentals of Human
Lymphoid Culture, Marcel Dekker, New York; Knowles et
al., 1989, Blood 73, 792-798; Metcalf, 1984, Clonal
Culture of Hematopoietic Cells: Techniques and
Applications, Elsevier, New York).

Fresh lymphoma tissue containing viable infected cells
is filtered to form a single cell suspension. The
cells are separated by Ficoll-Paque centrifugation
and the lymphocyte layer is removed. The lymphocytes
are then placed at >1x10^6 cells/ml into standard
lymphocyte tissue culture medium, such as RPMI 1640
supplemented with 10% fetal calf serum. Immortalized
lymphocytes containing KSHV are indefinitely grown in
the culture media while non-immortalized cells die
during the course of prolonged cultivation.

Further, KSHV may be propagated in a new cell line by
removing media supernatant containing the virus from
a continuously-infected cell line at a concentration
of >1x10^6 cells/ml. The media is centrifuged at 2000xg
for 10 minutes and filtered through a 0.45µ filter to
remove cells. The media is applied in a 1:1 volume
with cells growing at >1x10^6 cells/ml for 48 hours.
The cells are washed, pelleted and placed in fresh
culture medium, then tested for KSHV after 14 days.

KSHV may be isolated from a cell line in the following
manner. An infected cell line is lysed using standard
methods, such as hyposmotic shock or Dounce
homogenization or using repeated cycles of freezing
and thawing in a small volume (<3 ml), and pelleted at
2000xg for 10 minutes. The supernatant is removed and
centrifuged again at 10,000xg for 15 minutes to remove
nuclei and organelles. The resulting low-speed,
cell-free supernatant is filtered through a 0.45µ
filter and centrifuged at 100,000xg for 1 hour to
pellet the virus. The virus can then be washed and
re-pelleted. The DNA is extracted from the viral
pellet by standard techniques (e.g.,
phenol/chloroform) and tested for the presence of KSHV
by Southern blotting and/or PCR using the specific
probes described above.



For banding whole virion, the low-speed cell-free
supernatant is adjusted to contain 7% PEG-8000. The
PEG-supernatant is spun at 10,000xg for 30 min. The
supernatant is poured off and the pellet collected and
resuspended in a small volume (1-2 ml) of virus buffer
(VB, 0.1 M NaCl, 0.01 M Tris, pH 7.5). The virion are
isolated by centrifugation at 25,000 rpm in a 10-50%
sucrose gradient made with VB. One ml fractions of
the gradient are obtained by standard techniques
(e.g., using a fractionator) and each fraction is
tested by dot blotting using specific hybridizing
probes to determine the gradient fraction containing
the purified virus (preparation of the fraction is
needed in order to detect the presence of the virus,
i.e., standard DNA extraction).

The method for isolating the KSHV genome is based on
Pellicer et al., 1978, Cell 14, 133-141 and Gibson and
Roizman, 1972, J. Virol. 10, 1044-52.

A final method for isolating the KSHV genome is
clamped homogeneous electric field (CHEF) gel
electrophoresis. Agarose plugs are prepared by
resuspending cells infected with KSHV in 1% LMP
agarose (Biorad) and 0.9% NaCl at 42°C to a final
concentration of 2.5 x 10^7 cells/ml. Solidified
agarose plugs are transferred into lysis buffer (0.5M
EDTA pH 8.0, 1% sarcosyl, proteinase K at 1 mg/ml
final concentration) and incubated for 24 hours.
Approximately 10^7 cells are loaded in each lane. Gels
are run at a gradient of 6.0 V/cm with a run time of
26 h on a CHEF Mapper XA pulsed field gel
electrophoresis apparatus (Biorad), Southern blotted
and hybridized to KS631Bam, KS330Bam and an EBV
terminal repeat sequence.
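
As a worked check of the loading figures above, the plug volume needed per lane follows directly from the two stated numbers. The sketch below is only this arithmetic; the function name is hypothetical, and it assumes that cells are distributed evenly through the agarose so that loading is proportional to plug volume.

```python
def plug_volume_per_lane_ml(cells_per_lane, cells_per_ml):
    """Volume of agarose plug (ml) that carries the requested cell count.

    Assumes cells are evenly suspended in the plug at `cells_per_ml`.
    """
    if cells_per_lane <= 0 or cells_per_ml <= 0:
        raise ValueError("cell counts must be positive")
    return cells_per_lane / cells_per_ml


if __name__ == "__main__":
    # Values stated in the text: about 10^7 cells per lane from plugs
    # cast at 2.5 x 10^7 cells/ml.
    volume = plug_volume_per_lane_ml(1e7, 2.5e7)
    print(f"{volume:.2f} ml of plug per lane")
```

With the stated values, each lane corresponds to roughly 0.4 ml of plug material.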
To make a new cell line infected with KSHV, already-
infected cells are co-cultivated with a Raji cell line
separated by a 0.45µ filter. Approximately 1-2 x 10^6
already-infected BCBL-1 and 2x10^6 Raji cells are
co-cultivated for 2-20 days in supplemented RPMI alone
or with 20 ng/ml 12-O-tetradecanoyl phorbol-13-acetate
(TPA). After 2-20 days co-cultivation, Raji cells are
removed, washed and placed in supplemented RPMI 1640
media. A Raji culture co-cultivated with BCBL-1 in 20
ng/ml TPA for 2 days survived and has been kept in
continuous suspension culture for >10 weeks. This
cell line, designated RCC-1 (Raji Co-Culture, No. 1),
remains PCR positive for the KSHV sequence after
multiple passages. RCC-1 cells periodically undergo
rapid cytolysis suggestive of lytic reproduction of
KSHV. Thus, RCC-1 is a Raji cell line newly infected
with KSHV.
RCC-1 and RCC-1 2P were deposited on October 19, 1994
under ATCC Accession No. CRL 11734 and CRL 11725,
respectively, pursuant to the Budapest Treaty on the
International Deposit of Microorganisms for the
Purposes of Patent Procedure with the Patent Culture
Depository of the American Type Culture Collection,
12301 Parklawn Drive, Rockville, Maryland 20852 U.S.A.
HBL-6 was deposited (as BHL-6) on November 18, 1994
under ATCC Accession No. CRL 11762 pursuant to the
Budapest Treaty on the International Deposit of
Microorganisms for the Purposes of Patent Procedure
with the Patent Culture Depository of the American
Type Culture Collection, 12301 Parklawn Drive,
Rockville, Maryland 20852 U.S.A.



Hybridization Probes KSHV 



This invention provides a nucleic acid molecule of at
least 14 nucleotides capable of specifically
hybridizing with the isolated nucleic acid molecule
as set forth in GenBank Accession Numbers U75698,
U75699, U75700.

In one embodiment the nucleic acid molecule set forth
in GenBank Accession Number U75698 comprises the long
unique region (LUR) encoding KSHV polypeptides. In
another embodiment the nucleic acid molecule set forth
in GenBank Accession Number U75699 comprises the
prototypical terminal repeat (TR). In another
embodiment the nucleic acid molecule set forth in
GenBank Accession Number U75700 comprises the
incomplete terminal repeat (ITR).

In one embodiment the molecule is 8 to 36 nucleotides.
In another embodiment the molecule is 12 to 25
nucleotides. In another embodiment the molecule is 14
nucleotides.

In one embodiment the molecule is DNA. In another
embodiment the molecule is RNA.

In one embodiment the TR molecule contains cis-active
elements required for DNA replication and packaging.
In another embodiment the TR molecule is contained in
a gene-cloning vector. In another embodiment the TR
molecule is contained in a gene-therapy vector. In
another embodiment the gene-therapy vector is
expressed in lymphoid cells. In another embodiment,
the TR comprises a molecular marker for determining
the clonality of a tumor. In another embodiment, the
marker provides a defining feature of the natural
history of a tumor in a diagnostic assay.

This invention provides a B-lymphotrophic DNA vector
comprising a plasmid or other self-replicable DNA
molecule containing the 801 bp KSHV TR or a portion
thereof.

High stringency hybridization conditions are selected
at about 5°C lower than the thermal melting point (Tm)
for the specific sequence at a defined ionic strength
and pH. The Tm is the temperature (under defined ionic
strength and pH) at which 50% of the target sequence
hybridizes to a perfectly matched probe. Typically,
stringent conditions will be those in which the salt
concentration is at least about 0.02 molar at pH 7 and
the temperature is at least about 60°C. As other
factors may significantly affect the stringency of
hybridization, including, among others, base
composition and size of the complementary strands, the
presence of organic solvents, i.e. salt or formamide
concentration, and the extent of base mismatching, the
combination of parameters is more important than the
absolute measure of any one. For example, high
stringency may be attained by overnight hybridization
at about 68°C in a 6X SSC solution, washing at room
temperature with 6X SSC solution, followed by washing
at about 68°C in a 0.6X SSC solution.
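
The dependence of stringency on salt, base composition, strand length and formamide described above can be illustrated with a rough numerical sketch. The formula used here is a commonly cited empirical approximation for DNA:DNA hybrids and is an assumption of this example, not something stated in the specification; its coefficients vary between sources and it is least reliable for short oligonucleotides and at high salt. The Na+ content assumed for 1X SSC, the function names and the 40-nucleotide probe are likewise illustrative only.

```python
import math

# Assumption: 1X SSC is 0.15 M NaCl + 0.015 M trisodium citrate,
# i.e. roughly 0.195 M Na+.
NA_MOLAR_PER_1X_SSC = 0.195


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm(seq: str, ssc_x: float, formamide_pct: float = 0.0) -> float:
    """Rough Tm estimate in deg C (empirical approximation, not from the patent)."""
    na_molar = ssc_x * NA_MOLAR_PER_1X_SSC
    return (81.5
            + 16.6 * math.log10(na_molar)   # salt term
            + 0.41 * gc_percent(seq)        # base-composition term
            - 675.0 / len(seq)              # strand-length term
            - 0.61 * formamide_pct)         # organic-solvent term


def high_stringency_temp(seq: str, ssc_x: float, formamide_pct: float = 0.0) -> float:
    """Temperature about 5 deg C below the estimated Tm, as in the text."""
    return estimate_tm(seq, ssc_x, formamide_pct) - 5.0


probe = "ATGC" * 10                      # hypothetical 40-nt probe, 50% GC
tm_6x = estimate_tm(probe, 6.0)          # hybridization salt (6X SSC)
tm_06x = estimate_tm(probe, 0.6)         # final wash salt (0.6X SSC)
tm_formamide = estimate_tm(probe, 3.0, formamide_pct=50.0)
```

Under this approximation a tenfold drop in salt (6X to 0.6X SSC) lowers the estimated Tm by 16.6°C, which is why a low-salt wash at the same temperature is the more stringent step.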

Hybridization with moderate stringency may be attained
for example by: 1) filter pre-hybridizing and
hybridizing with a solution of 3X SSC, 50% formamide,
0.1M Tris buffer at pH 7.5, 5X Denhardt's solution;
2) pre-hybridization at 37°C for 4 hours; 3)
hybridization at 37°C with amount of labeled probe
equal to 3,000,000 cpm total for 16 hours; 4) wash in
2X SSC and 0.1% SDS solution; 5) wash 4X for 1 minute
each at room temperature and 4X at 60°C for 30
minutes each; and 6) dry and expose to film.

Nucleic acid probe technology is well known to those
skilled in the art who readily appreciate that such
probes may vary greatly in length and may be labeled
with a detectable label, such as a radioisotope or
fluorescent dye, to facilitate detection of the probe.
DNA probe molecules may be produced by insertion of a
DNA molecule having the full-length or a fragment of
the isolated nucleic acid molecule of the DNA virus
into suitable vectors, such as plasmids or
bacteriophages, followed by transforming into suitable
bacterial host cells, replication in the transformed
bacterial host cells and harvesting of the DNA probes,
using methods well known in the art. Alternatively,
probes may be generated chemically from DNA
synthesizers.

RNA probes may be generated by inserting the full
length or a fragment of the isolated nucleic acid
molecule of the DNA virus downstream of a
bacteriophage promoter such as T3, T7 or SP6. Large
amounts of RNA probe may be produced by incubating the
labeled nucleotides with a linearized isolated nucleic
acid molecule of the DNA virus or its fragment where
it contains an upstream promoter in the presence of
the appropriate RNA polymerase.

As defined herein nucleic acid probes may be DNA or
RNA fragments. DNA fragments can be prepared, for
example, by digesting plasmid DNA, or by use of PCR,
or synthesized by either the phosphoramidite method
described by Beaucage and Carruthers, 1981,
Tetrahedron Lett. 22, 1859-1862 or by the triester
method according to Matteucci et al., 1981, Am. Chem.
Soc. 103:3185. A double stranded fragment may then be
obtained, if desired, by annealing the chemically
synthesized single strands together under appropriate
conditions or by synthesizing the complementary strand
using DNA polymerase with an appropriate primer
sequence. Where a specific sequence for a nucleic
acid probe is given, it is understood that the
complementary strand is also identified and included.
The complementary strand will work equally well in
situations where the target is a double-stranded
nucleic acid. It is also understood that when a
specific sequence is identified for use as a nucleic
acid probe, a subsequence of the listed sequence which
is 25 base pairs (bp) or more in length is also
encompassed for use as a probe.

The nucleic acid molecules of the subject invention
also include molecules coding for polypeptide analogs,
fragments or derivatives of antigenic polypeptides
which differ from naturally-occurring forms in terms
of the identity or location of one or more amino acid
residues (deletion analogs containing less than all of
the residues specified for the polypeptide,
substitution analogs wherein one or more residues
specified are replaced by other residues and addition
analogs wherein one or more amino acid residues is
added to a terminal or medial portion of the
polypeptides) and which share some or all properties
of naturally-occurring forms. These molecules
include: the incorporation of codons "preferred" for
expression by selected non-mammalian hosts; the
provision of sites for cleavage by restriction
endonuclease enzymes; and the provision of additional
initial, terminal or intermediate DNA sequences that
facilitate construction of readily expressed vectors.
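
Two probe conventions stated in this section, namely that the complementary strand of a given probe sequence is included and that any subsequence of 25 bp or more is encompassed, can be illustrated with a minimal sketch. The function names and the toy 27-nucleotide sequence are hypothetical and chosen only for illustration; they are not taken from the specification.

```python
from typing import Iterator

# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_subsequences(seq: str, min_len: int = 25) -> Iterator[str]:
    """Yield every contiguous subsequence of at least min_len bases."""
    n = len(seq)
    for length in range(min_len, n + 1):
        for start in range(0, n - length + 1):
            yield seq[start:start + length]


toy_probe = "ATGCATGCATGCATGCATGCATGCATG"   # hypothetical 27-nt sequence
sub_probes = list(probe_subsequences(toy_probe))
complement_probe = reverse_complement(toy_probe)
```

For a listed sequence of length n, the number of encompassed subsequences grows quadratically with n, so in practice candidate probes would be filtered further, for example by specificity or melting temperature.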






C. Polypeptides of KSHV and Antibodies
(Ab's) Thereto

This invention provides an isolated KSHV polypeptide,
one from the list as set forth in Table 1 and below.

This invention provides the isolated KSHV polypeptide
comprising viral macrophage inflammatory protein III
(vMIP-III). In one embodiment, vMIP-III comprises an
orphan cytokine. In another embodiment, vMIP-III is
encoded by nucleotides 22,529-22,165. In another
embodiment, vMIP-III comprises an anti-inflammatory
drug. In a preferred embodiment, the drug is useful
in treatment of an autoimmune disorder. In the most
preferred embodiment, the drug is useful in treatment
of rheumatoid arthritis.

This invention provides the isolated KSHV polypeptide
comprising dihydrofolate reductase (DHFR) encoded by
ORF 2. In one embodiment, DHFR participates in KSHV
nucleotide synthesis. In another embodiment, DHFR
comprises an enzyme essential for viral replication,
inhibition of which prevents virus production. In
another embodiment, DHFR comprises a subunit vaccine.
In another embodiment, DHFR comprises an antigen for
immunologic assays.
In another embodiment, DHFR has the amino acid
sequence as set forth in SEQ ID NO: 1.

In another embodiment, KSHV DHFR is inhibited by a
sulfa drug known to inhibit bacterial DHFR. In a
preferred embodiment, KSHV DHFR is inhibited by
methotrexate or a derivative thereof known to inhibit
mammalian DHFR. In another embodiment, the sulfa
drug, methotrexate or a derivative thereof is
selective among the human herpesviruses for inhibition
of KSHV.

This invention provides the isolated KSHV polypeptide
comprising thymidylate synthase (TS) encoded by ORF
70. In one embodiment, TS participates in KSHV
nucleotide metabolism. In another embodiment, TS
comprises an enzyme essential for viral replication,
inhibition of which prevents virus production. In
another embodiment, TS comprises a subunit vaccine.
In another embodiment, TS comprises an antigen for
immunologic assays.

This invention provides the isolated KSHV polypeptide
comprising DNA polymerase encoded by ORF 9. In one
embodiment, DNA polymerase comprises an enzyme
essential for viral replication, inhibition of which
prevents virus production. In another embodiment, DNA
polymerase comprises a subunit vaccine. In another
embodiment, DNA polymerase comprises an antigen for
immunologic assays.

This invention provides the isolated KSHV polypeptide
comprising alkaline exonuclease encoded by ORF 37. In
one embodiment, alkaline exonuclease packages KSHV DNA
into the virus particle. In another embodiment,
alkaline exonuclease comprises an enzyme essential for
viral replication, inhibition of which prevents virus
production. In another embodiment, alkaline
exonuclease comprises a subunit vaccine. In another
embodiment, alkaline exonuclease comprises an antigen
for immunologic assays.

This invention provides the isolated KSHV polypeptide
comprising helicase-primase, subunits 1, 2 and 3
encoded by ORFs 40, 41 and 44, respectively. In one
embodiment, helicase-primase comprises an enzyme
activity essential for viral DNA replication. In
another embodiment, helicase-primase is inhibited by
nucleotide analogs. In another embodiment,
helicase-primase is inhibited by known antiviral drugs.
In another embodiment, inhibition of helicase-primase
prevents KSHV replication.

This invention provides the isolated KSHV polypeptide
comprising uracil DNA glycosylase (UDG) encoded by ORF
46. In one embodiment, uracil DNA glycosylase
comprises an enzyme essential for KSHV DNA repair
during DNA replication. In another embodiment, uracil
DNA glycosylase is inhibited by known antiviral drugs.
In another embodiment, uracil DNA glycosylase
comprises a subunit vaccine. In another embodiment,
uracil DNA glycosylase comprises an antigen for
immunologic assays.

This invention provides the isolated KSHV polypeptide
comprising single-stranded DNA binding protein (SSBP)
encoded by ORF 6. In one embodiment, SSBP comprises
an enzyme essential for KSHV DNA replication. In
another embodiment, SSBP is inhibited by known
antiviral drugs. In another embodiment, SSBP
increases the processivity of polymerase reactions
such as in the conventional PCR method for DNA
amplification.

This invention provides the isolated KSHV polypeptide
comprising viral protein kinase encoded by ORF 36. In
another embodiment, viral protein kinase comprises an
antigen for immunologic assays. In another
embodiment, viral protein kinase comprises a subunit
vaccine.

This invention provides the isolated KSHV polypeptide
comprising lytic cycle transactivator protein (LCTP)
encoded by ORF 50. In one embodiment, LCTP is
required for activation of productive infection from
the latent state. In another embodiment, LCTP is
inhibited by known antiviral drugs. In another
embodiment, prevention of LCTP expression maintains
the virus in a latent state unable to replicate.

This invention provides the isolated KSHV polypeptide
comprising ribonucleotide reductase, a two-subunit
enzyme in which the small and large subunits are
encoded by ORF 60 and ORF 61, respectively. In
another embodiment, ribonucleotide reductase catalyzes
conversion of ribonucleotides into
deoxyribonucleotides for DNA replication. In another
embodiment, ribonucleotide reductase is inhibited by
known antiviral drugs in terminally differentiated
cells not expressing cellular ribonucleotide
reductase. In another embodiment, ribonucleotide
reductase comprises an antigen for immunologic assays.
In another embodiment, ribonucleotide reductase
comprises a subunit vaccine. In another embodiment,
ribonucleotide reductase comprises a transforming
agent for establishment of immortalized cell lines.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K1.

This invention provides the isolated KSHV polypeptide
comprising complement-binding protein (v-CBP; CCP)
encoded by ORF 4.

This invention provides the isolated KSHV polypeptide
comprising transport protein encoded by ORF 7.

This invention provides the isolated KSHV polypeptide
comprising glycoprotein B encoded by ORF 8.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 10.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 11.

This invention provides the isolated KSHV polypeptide
comprising viral interleukin 6 (vIL-6) encoded by ORF
K2. In one embodiment, antibodies selectively
recognizing vIL-6 allow differentiation among
lymphomas.

This invention provides the isolated KSHV polypeptide
comprising BHV4-IE1 I encoded by ORF K3.

This invention provides the isolated KSHV polypeptide
comprising vMIP-II encoded by ORF K4. In one
embodiment, vMIP-II comprises an anti-inflammatory
drug. In a preferred embodiment, the drug is useful
in treatment of an autoimmune disorder. In the most
preferred embodiment, the drug is useful in treatment
of rheumatoid arthritis.

This invention provides the isolated KSHV polypeptide
comprising BHV4-IE1 II encoded by ORF K5.

This invention provides the isolated KSHV polypeptide
comprising vMIP-I encoded by ORF K6. In one
embodiment, vMIP-I comprises an anti-inflammatory
drug. In a preferred embodiment, the drug is useful
in treatment of an autoimmune disorder. In the most
preferred embodiment, the drug is useful in treatment
of rheumatoid arthritis.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K7.

This invention provides the isolated KSHV polypeptide
comprising Bcl-2 encoded by ORF 16.

This invention provides the isolated KSHV polypeptide
comprising capsid protein I encoded by ORF 17.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 18.

This invention provides the isolated KSHV polypeptide
comprising tegument protein I encoded by ORF 19.



This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 20.

This invention provides the isolated KSHV polypeptide 
comprising thymidine kinase encoded by ORF 21. 

This invention provides the isolated KSHV polypeptide
comprising glycoprotein H encoded by ORF 22.

In one embodiment, the isolated KSHV polypeptide
comprises the protein encoded by ORF 23.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 24.

This invention provides the isolated KSHV polypeptide
comprising major capsid protein encoded by ORF 25.

This invention provides the isolated KSHV polypeptide
comprising capsid protein II encoded by ORF 26.

This invention provides the isolated KSHV polypeptide 
comprising the protein encoded by ORF 27. 






This invention provides the isolated KSHV polypeptide 
comprising the protein encoded by ORF 28. 

This invention provides the isolated KSHV polypeptide
comprising packaging protein II encoded by ORF 29b.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 30.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 31.

This invention provides the isolated KSHV polypeptide 
comprising the protein encoded by ORF 32. 

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 33.

This invention provides the isolated KSHV polypeptide
comprising packaging protein I encoded by ORF 29a.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 34.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 35.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 38.

This invention provides the isolated KSHV polypeptide
comprising glycoprotein K encoded by ORF 39.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 42.

This invention provides the isolated KSHV polypeptide
comprising capsid protein III encoded by ORF 43.



This invention provides the isolated KSHV polypeptide 
comprising virion assembly protein encoded by ORF 45. 

This invention provides the isolated KSHV polypeptide 
comprising glycoprotein L encoded by ORF 47. 

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 48.

This invention provides the isolated KSHV polypeptide 
comprising the protein encoded by ORF 49. 

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K8.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 52.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 53.

This invention provides the isolated KSHV polypeptide
comprising dUTPase encoded by ORF 54.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 55.



This invention provides the isolated KSHV polypeptide
comprising DNA replication protein I encoded by ORF
56.

This invention provides the isolated KSHV polypeptide
comprising immediate early protein II (IEP-II) encoded
by ORF 57.

This invention provides the isolated KSHV polypeptide
comprising viral interferon regulatory factor 1
(vIRF1; ICSBP) encoded by ORF K9. In one embodiment,
vIRF1 is a transforming polypeptide.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K10.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K11.

This invention provides the isolated KSHV polypeptide
comprising phosphoprotein encoded by ORF 58.

This invention provides the isolated KSHV polypeptide
comprising DNA replication protein II encoded by ORF
59.

This invention provides the isolated KSHV polypeptide
comprising assembly/DNA maturation protein encoded by
ORF 62.

This invention provides the isolated KSHV polypeptide
comprising tegument protein II encoded by ORF 63.

This invention provides the isolated KSHV polypeptide
comprising tegument protein III encoded by ORF 64.

This invention provides the isolated KSHV polypeptide
comprising capsid protein IV encoded by ORF 65.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 66.

This invention provides the isolated KSHV polypeptide
comprising tegument protein IV encoded by ORF 67.

This invention provides the isolated KSHV polypeptide
comprising glycoprotein encoded by ORF 68.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF 69.

This invention provides the isolated KSHV polypeptide
comprising Kaposin encoded by ORF K12.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K13.

This invention provides the isolated KSHV polypeptide
comprising cyclin D encoded by ORF 72.

This invention provides the isolated KSHV polypeptide
comprising immediate-early protein (IEP) encoded by
ORF 73.

This invention provides the isolated KSHV polypeptide
comprising OX-2 encoded by ORF K14.

This invention provides the isolated KSHV polypeptide
comprising G-protein coupled receptor encoded by ORF
74.

This invention provides the isolated KSHV polypeptide
comprising tegument protein/FGARAT encoded by ORF 75.

This invention provides the isolated KSHV polypeptide
comprising the protein encoded by ORF K15.



This invention provides the isolated KSHV polypeptide
comprising viral interferon regulatory factor 2
(vIRF2) encoded by nucleotides 88,910-88,410.

This invention provides the isolated KSHV polypeptide
comprising viral interferon regulatory factor 3
(vIRF3) encoded by nucleotides 90,541-89,600.

This invention provides the isolated KSHV polypeptide
comprising viral interferon regulatory factor 4
(vIRF4) encoded by nucleotides 94,127-93,636.

This invention provides the isolated KSHV polypeptide
comprising a precursor of secreted glycoprotein X (gX)
encoded by nucleotides 50,173-50,643.

This invention provides the isolated KSHV polypeptide
comprising protein T1.1 (nut-1) encoded by
nucleotides 28,661-29,741.
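
Several polypeptides above are identified by nucleotide coordinates rather than ORF names, and some ranges are written with the larger coordinate first. A minimal sketch of how such a range could be read out of a genome sequence follows. It assumes, as is common practice but not stated in this text, that coordinates are 1-based and inclusive and that a descending range denotes the complementary strand. The function names and the 12-nucleotide toy genome are hypothetical.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_region(genome: str, start: int, end: int) -> str:
    """Extract a 1-based inclusive coordinate range from a genome string.

    Assumption: if start > end, the region lies on the complementary
    strand and is returned as the reverse complement.
    """
    low, high = min(start, end), max(start, end)
    if low < 1 or high > len(genome):
        raise ValueError("coordinates fall outside the sequence")
    region = genome[low - 1:high]
    return region if start <= end else reverse_complement(region)


toy_genome = "AAACCCGGGTTT"                    # hypothetical 12-nt sequence
forward_region = extract_region(toy_genome, 4, 6)
reverse_region = extract_region(toy_genome, 6, 1)
```

With a real genome record, the same function would be called with the coordinates listed above, for example `extract_region(lur_sequence, 22529, 22165)`, where `lur_sequence` is a placeholder for the long unique region sequence.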



Further, the isolated polypeptide may be linked to a
second polypeptide to form a fusion protein by linking
the isolated nucleic acid molecule to a second nucleic
acid molecule and expression in a suitable host cell.
In one embodiment the second nucleic acid molecule
encodes beta-galactosidase. Other nucleic acid
molecules which are used to form a fusion protein are
known to those skilled in the art.

This invention provides an antibody which specifically
binds to the polypeptide encoded by the isolated
nucleic acid molecule. In one embodiment the antibody
is a monoclonal antibody. In another embodiment the
antibody recognizes an epitope of the KSHV
polypeptide. In another embodiment the antibody is a
polyclonal antibody. In another embodiment the
antibody recognizes more than one epitope of the KSHV
polypeptide. In another embodiment the antibody is an
anti-idiotypic antibody.

An antibody, polypeptide or isolated nucleic acid
molecule may be labeled with a detectable marker
including, but not limited to: a radioactive label, or
a colorimetric, a luminescent, or a fluorescent
marker, or gold. Radioactive labels include, but are
not limited to: 3H, 14C, 32P, 33P, 35S, 36Cl, 51Cr,
57Co, 59Co, 59Fe, 90Y, 125I, 131I, and 186Re.
Fluorescent markers include, but are not limited to:
fluorescein, rhodamine and auramine. Colorimetric
markers include, but are not limited to: biotin, and
digoxigenin. Methods of producing the polyclonal or
monoclonal antibody are known to those of ordinary
skill in the art.

Further, the antibody, polypeptide or nucleic acid
molecule may be detected by a second antibody which
may be linked to an enzyme, such as alkaline
phosphatase or horseradish peroxidase. Other enzymes
which may be employed are well known to one of
ordinary skill in the art.

This invention provides a method of producing a
polypeptide encoded by the isolated nucleic acid
molecule, which comprises growing a host-vector system
under suitable conditions permitting production of the
polypeptide and recovering the polypeptide so
produced. Suitable host cells include bacteria,
yeast, filamentous fungal, plant, insect and mammalian
cells. Host-vector systems for producing and
recovering a polypeptide are well known to those
skilled in the art and include, but are not limited
to, E. coli and pMAL (New England Biolabs), the Sf9
insect cell-baculovirus expression system, and
mammalian cells (such as HeLa, COS, NIH 3T3 and
HEK293) transfected with a mammalian expression vector
by Lipofectin (Gibco-BRL) or calcium phosphate
precipitation or other methods to achieve vector entry
into the cell. Those of skill in the art are
knowledgeable in the numerous expression systems
available for expression of KSHV polypeptide.

This invention provides a method to select specific
regions on the polypeptide encoded by the isolated
nucleic acid molecule of the DNA virus to generate
antibodies. Amino acid sequences may be analyzed by
methods well known to those skilled in the art to
determine whether they produce hydrophobic or
hydrophilic regions in the polypeptides which they
build. In the case of a cell membrane polypeptide,
hydrophobic regions are well known to form the part of
the polypeptide that is inserted into the lipid
bilayer of the cell membrane, while hydrophilic
regions are located on the cell surface, in an aqueous
environment. Usually, the hydrophilic regions will be
more immunogenic than the hydrophobic regions.
Therefore the hydrophilic amino acid sequences may be
selected and used to generate antibodies specific to
polypeptide encoded by the isolated nucleic acid
molecule encoding the DNA virus. The selected
peptides may be prepared using commercially available
machines. As an alternative, nucleic acid may be
cloned and expressed and the resulting polypeptide
recovered and used as an immunogen.

Polyclonal antibodies against the polypeptide may be
produced by immunizing animals using a selected KSHV
polypeptide. Monoclonal antibodies are prepared using
hybridoma technology by fusing antibody producing B
cells from immunized animals with myeloma cells and
selecting the resulting hybridoma cell line producing
the desired antibody, as described further below.
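
The selection of hydrophilic regions as candidate immunogens, described above, can be sketched with a sliding-window hydropathy profile. The specification does not name a particular method; this example uses the Kyte-Doolittle hydropathy scale as one widely used choice, where lower window averages indicate more hydrophilic stretches. The function names, the window size and the toy peptide are hypothetical.

```python
from typing import List, Tuple

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> List[float]:
    """Mean hydropathy of each window of `window` consecutive residues."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    if window < 1 or window > len(scores):
        raise ValueError("window must be between 1 and the protein length")
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def most_hydrophilic_window(protein: str, window: int = 7) -> Tuple[int, str]:
    """Return (0-based start, peptide) of the most hydrophilic window."""
    profile = hydropathy_profile(protein, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, protein[start:start + window]


toy_protein = "LLLLLLKDEKDELLLLLL"      # hypothetical sequence
best_start, best_peptide = most_hydrophilic_window(toy_protein, window=6)
```

A hydropathy minimum is only a first filter. Surface exposure in the folded protein and sequence uniqueness would also bear on whether a peptide makes a useful immunogen.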
II. Immunoassays

The antibodies raised against KSHV polypeptide
antigens may be detectably labeled, utilizing
conventional labelling techniques well-known to the
art, as described above.

In addition, enzymes may be used as labels. Suitable
enzymes include alkaline phosphatase,
beta-galactosidase, glucose-6-phosphate dehydrogenase,
maleate dehydrogenase and peroxidase. Two principal
types of enzyme immunoassay are the enzyme-linked
immunosorbent assay (ELISA), and the homogeneous
enzyme immunoassay, also known as enzyme-multiplied
immunoassay (EMIT, Syva Corporation, Palo Alto, CA).
In the ELISA system, separation may be achieved, for
example, by the use of antibodies coupled to a solid
phase. The EMIT system depends on deactivation of the
enzyme in the tracer-antibody complex; activity is
thus measured without the need for a separation step.

Additionally, chemiluminescent compounds may be used
as labels. Typical chemiluminescent compounds include
luminol, isoluminol, aromatic acridinium esters,
imidazoles, acridinium salts, and oxalate esters.
Similarly, bioluminescent compounds may be utilized
for labelling, the bioluminescent compounds including
luciferin, luciferase, and aequorin.

A description of a radioimmunoassay (RIA) may be found
in: Laboratory Techniques in Biochemistry and
Molecular Biology (1978) North Holland Publishing
Company, New York, with particular reference to the
chapter entitled "An Introduction to Radioimmune Assay
and Related Techniques" by T. Chard. A description of
general immunometric assays of various types can be
found in the following U.S. Pat. Nos. 4,376,110 (David
et al.) or 4,098,876 (Piasio).



A. Assays for KSHV Polypeptide Antigens



One can use immunoassays to detect the virus, its
components, or antibodies thereto. A general overview
of the applicable technology is in Harlow and Lane
(1988) Antibodies, A Laboratory Manual, Cold Spring
Harbor Publication, New York.

In one embodiment, antibodies to KSHV polypeptide
antigens can be used. In brief, to produce
antibodies, the polypeptide being targeted is
expressed and purified. The product is injected into
a mammal capable of producing antibodies. Either
polyclonal or monoclonal antibodies (including
recombinant antibodies) specific for the gene product
can be used in various immunoassays. Such assays
include competitive immunoassays, radioimmunoassays,
Western blots, ELISA, indirect immunofluorescent
assays and the like. For competitive immunoassays,
see Harlow and Lane at pages 567-573 and 584-589.

Monoclonal antibodies or recombinant antibodies may be
obtained by techniques familiar to those skilled in
the art. Briefly, spleen cells or other lymphocytes
from an animal immunized with a desired antigen are
immortalized, commonly by fusion with a myeloma cell
(see, Kohler and Milstein, 1976, Eur. J. Immunol. 6,
511-519). Alternative methods of immortalization
include transformation with Epstein Barr Virus,
oncogenes, or retroviruses, or other methods well
known in the art. Colonies arising from single
immortalized cells are screened for production of
antibodies of the desired specificity and affinity for
the antigen, and yield of the monoclonal antibodies
produced by such cells may be enhanced by various
techniques, including injection into the peritoneal
cavity of a vertebrate host. Newer techniques using
recombinant phage antibody expression systems can also
be used to generate monoclonal antibodies. See, for
example: McCafferty et al. (1990) Nature 348, 552;
Hoogenboom et al. (1991) Nuc. Acids Res. 19, 4133; and
Marks et al. (1991) J. Mol. Biol. 222, 581-597.

Methods for characterizing naturally processed
peptides bound to MHC (major histocompatibility
complex) I molecules can be used. See Falk et al.,
1991, Nature 351, 290 and PCT publication No. WO
92/21033 published November 26, 1992. Typically,
these methods involve isolation of MHC class I
molecules by immunoprecipitation or affinity
chromatography from an appropriate cell or cell line.
Other methods involve direct amino acid sequencing of
the more abundant peptides in various HPLC fractions
by known automatic sequencing of peptides eluted from
Class I molecules of the B cell type (Jardetzkey et
al., 1991, Nature 353, 326), and of the human MHC
class I molecule, HLA-A2.1 type by mass spectrometry
(Hunt et al., 1991, Eur. J. Immunol. 21, 2963-2970).
See also, Rotzschke and Falk, 1991, Immunol. Today 12,
447, for a general review of the characterization of
naturally processed peptides in MHC class I. Further,
Marloes et al., 1991, Eur. J. Immunol. 21, 2963-2970,
describe how class I binding motifs can be applied to
the identification of potential viral immunogenic
peptides in vitro.

The polypeptides described herein produced by
recombinant technology may be purified by standard
techniques well known to those of skill in the art.
Recombinantly produced viral polypeptides can be
directly expressed or expressed as a fusion protein.
The protein is then purified by a combination of cell
lysis (e.g., sonication) and affinity chromatography.
For fusion products, subsequent digestion of the
fusion protein with an appropriate proteolytic enzyme
releases the desired peptide.

The polypeptides may be purified to substantial purity
by standard techniques well known in the art,
including selective precipitation with such substances
as ammonium sulfate, column chromatography,
immunopurification methods, and others. See, for
instance, Scopes, 1982, Protein Purification:
Principles and Practice, Springer-Verlag, New York.

B. Assays for Antibodies Specifically Binding to KSHV Polypeptides

Antibodies reactive with polypeptide antigens of KSHV
can also be measured by a variety of immunoassay
methods that are similar to the procedures described
above for measurement of antigens. For a review of
immunological and immunoassay procedures applicable to
the measurement of antibodies by immunoassay
techniques, see Basic and Clinical Immunology, 7th
Edition, Stites and Terr, Eds., and Harlow and Lane,
1988, Antibodies, A Laboratory Manual, Cold Spring
Harbor, New York.

In brief, immunoassays to measure antibodies reactive
with polypeptide antigens of KSHV can be either
competitive or noncompetitive binding assays. In
competitive binding assays, the sample analyte
competes with a labeled analyte for specific binding
sites on a capture agent bound to a solid surface.
Preferably the capture agent is a purified recombinant
human herpesvirus polypeptide produced as described
above. Other sources of human herpesvirus
polypeptides, including isolated or partially purified
naturally occurring polypeptide, may also be used.

Noncompetitive assays are typically sandwich assays,
in which the sample analyte is bound between two
analyte-specific binding reagents. One of the binding
agents is used as a capture agent and is bound to a
solid surface. The second binding agent is labeled
and is used to measure or detect the resultant complex
by visual or instrument means. A number of
combinations of capture agent and labeled binding
agent can be used. A variety of different immunoassay
formats, separation techniques and labels can also be
used, similar to those described above for the
measurement of KSHV polypeptide antigens.

Hemagglutination Inhibition (HI) and Complement
Fixation (CF) are two laboratory tests that can be
used to detect infection with human herpesvirus by
testing for the presence of antibodies against the
virus or antigens of the virus.

Serological methods can also be useful when one wishes
to detect antibody to a specific viral variant. For
example, one may wish to see how well a vaccine
recipient has responded to a new preparation by assay
of patient sera.



WO 98/04576 PCT/US97/13346

XI. Vector, Cell Line and Transgenic Mammal

This invention provides a replicable vector containing
the isolated nucleic acid molecule encoding a KSHV
polypeptide. The vector includes, but is not limited
to: a plasmid, cosmid, λ phage or yeast artificial
chromosome (YAC) which contains the isolated nucleic
acid molecule.

To obtain the vector, for example, insert and vector
DNA can both be exposed to a restriction enzyme to
create complementary ends on both molecules which base
pair with each other and are then ligated together
with DNA ligase. Alternatively, linkers can be
ligated to the insert DNA which correspond to a
restriction site in the vector DNA, which is then
digested with the restriction enzyme which cuts at
that site. Other means are available and well known
to those skilled in the art.

This invention provides a host cell containing the
vector. Suitable host cells include, but are not
limited to, bacteria (such as E. coli), yeast, fungi,
plant, insect and mammalian cells. Suitable animal
cells include, but are not limited to, Vero cells, HeLa
cells, Cos cells, CV1 cells and various primary
mammalian cells.
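
By way of illustration only, the restriction-digest route to
the vector described above can be sketched in Python. The
sketch assumes EcoRI (recognition site GAATTC, cut between G
and A on the top strand), an enzyme not named in this
description and chosen only as a familiar example. The
function name digest and the example sequence are
hypothetical, and only the top strand is modeled.

```python
# Assumption: EcoRI is used purely as an example enzyme; the text names none.
ECORI_SITE = "GAATTC"
ECORI_CUT_OFFSET = 1  # top-strand cut falls between G and AATTC


def digest(seq: str, site: str = ECORI_SITE,
           cut_offset: int = ECORI_CUT_OFFSET) -> list[str]:
    """Return top-strand fragments produced by cutting seq at every site."""
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])  # fragment up to the cut point
        start = cut
        pos = seq.find(site, pos + 1)     # look for the next site
    fragments.append(seq[start:])         # trailing fragment
    return fragments


if __name__ == "__main__":
    # Hypothetical insert flanked by two EcoRI sites.
    print(digest("CCGAATTCAAAAGAATTCGG"))
```

In this simplified model the internal fragment begins with
AATTC and ends with G. Insert and vector fragments cut with
the same enzyme therefore carry matching ends, which is what
allows them to base pair before ligation.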



This invention provides a transgenic nonhuman mammal
which comprises the isolated nucleic acid molecule
introduced into the mammal at an embryonic stage.
Methods of producing a transgenic nonhuman mammal are
known to those skilled in the art.

XII. Diagnostic Assays for KS

This invention embraces diagnostic test kits for
detecting the presence of KSHV in biological samples,
such as skin samples or samples of other affected
tissue, comprising a container containing a nucleic
acid sequence specific for a KSHV polypeptide and
instructional material for performing the test. A
container containing nucleic acid primers to any one
of such sequences is optionally included.

This invention further embraces diagnostic test kits
for detecting the presence of KSHV in biological
samples, such as serum or solid tissue samples,
comprising a container containing antibodies to a KSHV
polypeptide, and instructional material for performing
the test. Alternatively, inactivated viral particles
or polypeptides derived from the human herpesvirus may
be used in a diagnostic test kit to detect antibodies
specific for a KSHV polypeptide.

A. Nucleic Acid Assays

This invention provides a method of diagnosing
Kaposi's sarcoma in a subject which comprises: (a)
obtaining a nucleic acid molecule from a tumor lesion
or a suitable bodily fluid of the subject; (b)
contacting the nucleic acid molecule with a labeled
nucleic acid molecule of at least 15 nucleotides
capable of specifically hybridizing with the isolated
nucleic acid molecule of KSHV under hybridizing
conditions; and (c) determining the presence of the
nucleic acid molecule hybridized, the presence of
which is indicative of Kaposi's sarcoma in the
subject, thereby diagnosing Kaposi's sarcoma in the
subject.

In one embodiment the nucleic acid molecule from the
tumor lesion is amplified before step (b). In another
embodiment the polymerase chain reaction (PCR) is
employed to amplify the nucleic acid molecule.
Methods of amplifying nucleic acid molecules are known
to those skilled in the art.

A person of ordinary skill in the art will be able to
obtain an appropriate nucleic acid sample for diagnosing
Kaposi's sarcoma in the subject. The DNA sample
obtained by the above described method may be cleaved
by restriction enzyme before analysis, a technique
well known in the art.

In the above described methods, a size fractionation
may be employed which is effected by a polyacrylamide
gel. In one embodiment, the size fractionation is
effected by an agarose gel. Further, transferring the
nucleic acid fragments onto a solid matrix may be
employed before a hybridization step. One example of
such solid matrix is nitrocellulose paper.

This invention provides a method of detecting
expression of a KSHV gene in a cell which comprises
obtaining mRNA from the cell, contacting the mRNA
with a labeled nucleic acid molecule of KSHV under
hybridizing conditions, determining the presence of
mRNA hybridized to the molecule, thereby detecting
expression of the KSHV gene. In one embodiment cDNA
is prepared from the mRNA obtained from the cell and
used to detect KSHV expression.

Accepted means for conducting hybridization assays are
known and general overviews of the technology can be
had from a review of: Nucleic Acid Hybridization: A
Practical Approach (1985) Hames and Higgins, Eds., IRL
Press; Hybridization of Nucleic Acids Immobilized on
Solid Supports, Meinkoth and Wahl, Analytical
Biochemistry (1984) 138, 267-284; and Innis et al., PCR
Protocols (1990) Academic Press, San Diego.

Target-specific probes may be used in the nucleic acid
hybridization diagnostic assays for KS. The probes
are specific for or complementary to the target of
interest. For precise allelic differentiations, the
probes should be about 14 nucleotides long and
preferably about 20-30 nucleotides. For more general
detection of KSHV, nucleic acid probes are about 50 to
1000 nucleotides, most preferably about 200 to 400
nucleotides.

A specific nucleic acid probe can be RNA, DNA,
oligonucleotide, or their analogs. The probes may be
single or double stranded nucleic acid molecules. The
probes of the invention may be synthesized
enzymatically, using methods well known in the art
(e.g., nick translation, primer extension, reverse
transcription, the polymerase chain reaction, and
others) or chemically (e.g., by methods described by
Beaucage and Carruthers or Matteucci et al., supra).

The probe must be of sufficient length to be able to
form a stable duplex with its target nucleic acid in
the sample, i.e., at least about 14 nucleotides, and
may be longer (e.g., at least about 50 or 100 bases in
length). Often the probe will be more than about 100
bases in length. For example, when probe is prepared
by nick-translation of DNA in the presence of labeled
nucleotides the average probe length may be about 100-
600 bases.

For discussions of nucleic acid probe design and
annealing conditions see, for example, Ausubel et al.,
supra; Berger and Kimmel, Eds., Methods in Enzymology
Vol. 152 (1987) Academic Press, New York; or
Hybridization with Nucleic Acid Probes, pp. 495-524
(1993) Elsevier, Amsterdam.

Usually, at least a part of the probe will have
considerable sequence identity with the target nucleic
acid. Although the extent of the sequence identity
required for specific hybridization will depend on the
length of the probe and the hybridization conditions,
the probe will usually have at least 70% identity to
the target nucleic acid, more usually at least 80%
identity, still more usually at least 90% identity and
most usually at least 95% or 100% identity.

The following stringent hybridization and washing
conditions will be adequate to distinguish a specific
probe (e.g., a fluorescently labeled nucleic acid
probe) from a probe that is not specific: incubation
of the probe with the sample for 12 hours at 37°C in
a solution containing denatured probe, 50% formamide,
2X SSC, and 0.1% (w/v) dextran sulfate, followed by
washing in 1X SSC at 70°C for 5 minutes; 2X SSC at
37°C for 5 minutes; 0.2X SSC at room temperature for
5 minutes, and H2O at room temperature for 5 minutes.
Those of skill are aware that it will often be
advantageous in nucleic acid hybridizations (i.e., in
situ, Southern, or Northern) to include detergents
(e.g., sodium dodecyl sulfate), chelating agents
(e.g., EDTA) or other reagents (e.g., buffers,
Denhardt's solution, dextran sulfate) in the
hybridization or wash solutions. To evaluate
specificity, probes can be tested on host cells
containing KSHV and compared with the results from
cells containing non-KSHV virus.
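
The sequence identity thresholds stated above (at least 70%,
80%, 90%, and 95% or 100%) can be illustrated with a minimal
Python sketch. It assumes that the probe and the target
region have already been aligned to equal length and written
in the same orientation, with no gaps. The function names
percent_identity and identity_tier and the example sequences
are hypothetical and do not come from this description.

```python
def percent_identity(probe: str, target_region: str) -> float:
    """Percent of matching positions between two pre-aligned sequences.

    Assumption: both sequences are gap-free, equal in length, and
    written in the same orientation.
    """
    if len(probe) != len(target_region):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not probe:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(probe.upper(), target_region.upper()))
    return 100.0 * matches / len(probe)


def identity_tier(identity: float) -> str:
    """Map a percent identity onto the tiers named in the text."""
    for threshold in (95, 90, 80, 70):
        if identity >= threshold:
            return f"at least {threshold}%"
    return "below 70%"


if __name__ == "__main__":
    # Hypothetical 10-mer with a single mismatch at the last position.
    value = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(value, identity_tier(value))
```

Whether a given identity suffices in practice still depends
on probe length and hybridization conditions, as noted above.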



It will be apparent to those of ordinary skill in the
art that a convenient method for determining whether
a probe is specific for a KSHV nucleic acid molecule
utilizes a Southern blot (or dot blot) using DNA
prepared from the virus. Briefly, to identify a
target-specific probe, DNA is isolated from the virus.
Test DNA, either viral or cellular, is transferred to
a solid (e.g., charged nylon) matrix. The probes are
labeled by conventional methods. Following
denaturation and/or prehybridization steps known in
the art, the probe is hybridized to the immobilized
DNAs under stringent conditions, such as defined
above.

It is further appreciated that in determining probe
specificity and in utilizing the method of this
invention to detect KSHV, a certain amount of
background signal is typical and can easily be
distinguished by one of skill from a specific signal.
Two-fold signal over background is acceptable.
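
The two-fold criterion above amounts to a simple ratio test,
sketched here in Python. The function name
is_specific_signal, its parameters, and the example readings
are hypothetical. The units of the readings are left open;
signal and background need only be measured the same way.

```python
def is_specific_signal(signal: float, background: float,
                       min_fold: float = 2.0) -> bool:
    """True when signal is at least min_fold times the background reading."""
    if background <= 0:
        raise ValueError("background must be a positive reading")
    return signal / background >= min_fold


if __name__ == "__main__":
    # Hypothetical readings in arbitrary but identical units.
    print(is_specific_signal(250.0, 100.0))  # 2.5-fold over background
    print(is_specific_signal(150.0, 100.0))  # 1.5-fold over background
```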

A preferred method for detecting the KSHV polypeptide
is the use of PCR and/or dot blot hybridization.
Other methods to test for the presence or absence of
KSHV for detection or prognosis, or risk assessment
for KS include Southern transfers, solution
hybridization or non-radioactive detection systems,
all of which are well known to those of skill in the
art. Hybridization is carried out using probes.
Visualization of the hybridized portions allows the
qualitative determination of the presence or absence
of the causal agent.

Similarly, a Northern transfer or reverse
transcriptase PCR may be used for the detection of
KSHV messenger RNA in a sample. These procedures are
also well known in the art. See Sambrook et al.
(1989) Molecular Cloning: A Laboratory Manual (2nd
ed.), Cold Spring Harbor Laboratory, Vols. 1-3.

An alternative means for determining the presence of
the human herpesvirus is in situ hybridization, or
more recently, in situ polymerase chain reaction. In
situ PCR is described in Nuovo et al. (1992)
Intracellular localization of PCR-amplified hepatitis
C cDNA, in American Journal of Surgical Pathology
17(7), 683-690; Bagasra et al. (1992) Detection of
HIV-1 provirus in mononuclear cells by in situ PCR, in
New England Journal of Medicine 326(21), 1385-1391;
and Heniford et al. (1993) Variation in cellular EGF
receptor mRNA expression demonstrated by in situ
reverse transcriptase polymerase chain reaction, in
Nucleic Acids Research 21, 3159-3166. In situ
hybridization assays are well known and are generally
described in Methods Enzymol. Vol. 152 (1987) Berger
and Kimmel, Eds., Academic Press, New York. In an in
situ hybridization, cells are fixed to a solid
support, typically a glass slide. The cells are then
contacted with a hybridization solution at a moderate
temperature to permit annealing of target-specific
probes that are labeled. The probes are preferably
labeled with radioisotopes or fluorescent reporters.

The above-described probes are also useful for in situ
hybridization or in order to locate tissues which
express the gene, or for other hybridization assays
for the presence of the gene or its mRNA in various
biological tissues. In situ hybridization is a
sensitive localization method which is not dependent
on expression of polypeptide antigens or native versus
denatured conditions.

Synthetic oligonucleotide (oligo) probes and
riboprobes made from KSHV phagemids or plasmids are
also provided. Successful hybridization conditions
in tissue sections are readily transferable from one
probe to another. Commercially synthesized
oligonucleotide probes are prepared using the
nucleotide sequence of the identified gene. These
probes are chosen for length (45-65 mers) and high G-C
content (50-70%), and are screened for uniqueness
against other viral sequences in GenBank.
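
The length and G-C content criteria stated above for
oligonucleotide probes (45-65 mers, 50-70% G-C) can be
checked with a short Python sketch. The function names
gc_fraction and passes_probe_criteria and the example
sequences are hypothetical. The uniqueness screen against
GenBank is not modeled here and would require a separate
database search.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_probe_criteria(seq: str, min_len: int = 45, max_len: int = 65,
                          min_gc: float = 0.50, max_gc: float = 0.70) -> bool:
    """Apply only the length and G-C content criteria from the text."""
    if not (min_len <= len(seq) <= max_len):
        return False
    return min_gc <= gc_fraction(seq) <= max_gc


if __name__ == "__main__":
    candidate = "GC" * 15 + "AT" * 10  # hypothetical 50-mer, 60% G-C
    print(len(candidate), gc_fraction(candidate),
          passes_probe_criteria(candidate))
```

A candidate that passes both checks would still need to be
compared against other viral sequences before use.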

Oligos are 3' end-labeled with [α-35S]dATP to specific
activities in the range of 1 x 10^10 dpm/µg using
terminal deoxynucleotidyl transferase. Unincorporated
labeled nucleotides are removed from the oligo probe
by centrifugation through a Sephadex G-25 column or by
elution from a Waters Sep Pak C-18 column.

KS tissue embedded in OCT compound and snap frozen in
freezing isopentane cooled with dry ice is cut at 6 µm
intervals and thawed onto 3-aminopropyltriethoxysilane
treated slides and allowed to air dry. The slides are
then fixed in 4% freshly prepared paraformaldehyde and
rinsed in water. Formalin-fixed, paraffin embedded KS
tissues cut at 6 µm and baked onto glass slides can
also be used. These sections are then deparaffinized
in xylenes and rehydrated through graded alcohols.
Prehybridization in 20mM Tris pH 7.5, 0.02% Denhardt's
solution, 10% dextran sulfate for 30 min at 37°C is
followed by hybridization overnight in a solution of
50% formamide (v/v), 10% dextran sulfate (w/v), 20mM
sodium phosphate (pH 7.4), 3X SSC, 1X Denhardt's
solution, 100 µg/ml salmon sperm DNA, 125 µg/ml yeast
tRNA and the oligo probe (10^6 cpm/ml) at 42°C
overnight. The slides are washed twice with 3X SSC
and twice with 1X SSC for 15 minutes each at room
temperature and visualized by autoradiography.
Briefly, sections are dehydrated through graded
alcohols containing 0.3M ammonium acetate, and air
dried. The slides are dipped in Kodak NTB2 emulsion,
exposed for days to weeks, developed, and
counterstained with hematoxylin and eosin (H&E).

Alternative immunohistochemical protocols may be
employed which are well known to those skilled in the
art.

B. Immunologic Assays

This invention provides a method of diagnosing
Kaposi's sarcoma in a subject, which comprises (a)
obtaining a suitable bodily fluid sample from the
subject, (b) contacting the suitable bodily fluid of
the subject to a support having already bound thereto
an antibody recognizing the KSHV polypeptide, so as to
bind the antibody to a specific KSHV polypeptide
antigen, (c) removing unbound bodily fluid from the
support, and (d) determining the level of the antibody
bound by the antigen, thereby diagnosing Kaposi's
sarcoma.

This invention provides a method of diagnosing 
Kaposi's sarcoma in a subject, which comprises (a) 
obtaining a suitable bodily fluid sample from the 
subject, (b) contacting the suitable bodily fluid of 
the subject to a support having already bound thereto 
the KSHV polypeptide antigen, so as to bind the
antigen to a specific Kaposi's sarcoma antibody, (c) 
removing unbound bodily fluid from the support, and 
(d) determining the level of the antigen bound by the 
Kaposi's sarcoma antibody, thereby diagnosing Kaposi's 
sarcoma. 



The suitable bodily fluid sample is any bodily fluid
sample which would contain Kaposi's sarcoma antibody,
antigen or fragments thereof. A suitable bodily fluid
includes, but is not limited to,
cerebrospinal fluid, lymphocytes, urine, transudates,
or exudates. In the preferred embodiment, the
suitable bodily fluid sample is serum or plasma. In
addition, the sample may be cells from bone marrow, or
a supernatant from a cell culture. Methods of
obtaining a suitable bodily fluid sample from a
subject are known to those skilled in the art.
Methods of determining the level of antibody or
antigen include, but are not limited to: ELISA, IFA,
and Western blotting. Other methods are known to
those skilled in the art. Further, a subject infected
with KSHV may be diagnosed as infected with the above-
described methods.

The detection of KSHV and the detection of virus-
associated KS are essentially identical processes.
The basic principle is to detect the virus using
specific ligands that bind to the virus but not to
other polypeptides or nucleic acids in a normal human
cell or its environs. The ligands can be nucleic acid
molecules, polypeptides or antibodies. The ligands
can be naturally occurring or genetically or
physically modified, such as nucleic acids with non-
natural nucleotide bases or antibody derivatives,
i.e., Fab or chimeric antibodies. Serological tests
for detection of antibodies to the virus present in
subject sera may also be performed by using the KSHV
polypeptide as an antigen, as described herein.

Samples can be taken from patients with KS or from
patients at risk for KS, such as AIDS patients.
Typically the samples are taken from blood (cells,
serum and/or plasma) or from solid tissue samples such
as skin lesions. The most accurate diagnosis for KS
will occur if elevated titers of the virus are
detected in the blood or in involved lesions. KS may
also be indicated if antibodies to the virus are
detected and if other diagnostic factors for KS are
present.

See Immunoassays above for more details on
immunoreagents of the invention for use in diagnostic
assays for KS.

IV. Treatment of Human Herpesvirus-Induced KS

This invention provides a method for treating a
subject with Kaposi's sarcoma (KS) comprising
administering to the subject having KS a
pharmaceutically effective amount of an antiviral
agent in a pharmaceutically acceptable carrier,
wherein the agent is effective to treat the subject
with KSHV.

Further, this invention provides a method of
prophylaxis or treatment for Kaposi's sarcoma (KS) by
administering to a patient at risk for KS an antibody
that binds to KSHV in a pharmaceutically acceptable
carrier.



This invention provides a method of treating a subject
with Kaposi's sarcoma comprising administering to the
subject an effective amount of an antisense molecule
capable of hybridizing to the isolated DNA molecule
of KSHV under conditions such that the antisense
molecule selectively enters a KS tumor cell of the
subject, so as to treat the subject.

A. Nucleic Acid Therapeutics

This invention provides an antisense molecule capable
of hybridizing to the isolated nucleic acid molecule
of KSHV. In one embodiment the antisense molecule is
DNA. In another embodiment the antisense molecule is
RNA. In another embodiment, the antisense molecule is
a nucleic acid derivative (e.g., DNA or RNA with a
protein backbone).
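
As an illustration of what hybridizing to a target means at
the sequence level, an antisense strand is the reverse
complement of the sense strand it binds. The Python sketch
below shows this for DNA and RNA. The function names
antisense_dna and antisense_rna and the six-base example
sequences are hypothetical and are not KSHV sequences.

```python
# Translation tables for Watson-Crick base pairing.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
_RNA_COMPLEMENT = str.maketrans("ACGUacgu", "UGCAugca")


def antisense_dna(sense: str) -> str:
    """Reverse complement of a DNA sense strand, written 5' to 3'."""
    return sense.translate(_DNA_COMPLEMENT)[::-1]


def antisense_rna(sense: str) -> str:
    """Reverse complement of an RNA sense strand, written 5' to 3'."""
    return sense.translate(_RNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical six-base targets, for illustration only.
    print(antisense_dna("ATGGCC"))
    print(antisense_rna("AUGGCC"))
```

A practical antisense molecule would be considerably longer
and would be chosen for specificity to the intended target.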

The present invention extends to the preparation of 
antisense nucleic acids and ribozymes that may be used 
to interfere with the expression of a polypeptide 
either by masking the mRNA with an antisense nucleic 
acid or cleaving it with a ribozyme, respectively.

This invention provides inhibitory nucleic acid
therapeutics which can inhibit the activity of
herpesviruses in patients with KS by binding to the
isolated nucleic acid molecule of KSHV. Inhibitory
nucleic acids may be single-stranded nucleic acids,
which can specifically bind to a complementary nucleic
acid sequence. By binding to the appropriate target
sequence, an RNA-RNA, a DNA-DNA, or RNA-DNA duplex or
triplex is formed. These nucleic acids are often
termed "antisense" because they are usually
complementary to the sense or coding strand of the
gene, although recently approaches for use of "sense"
nucleic acids have also been developed. The term
"inhibitory nucleic acids" as used herein refers to
both "sense" and "antisense" nucleic acids.

By binding to the target nucleic acid, the inhibitory
nucleic acid can inhibit the function of the target
nucleic acid. This could, for example, be a result of
blocking DNA transcription, processing or poly(A)
addition to mRNA, DNA replication, translation, or
promoting inhibitory mechanisms of the cells, such as
promoting RNA degradation. Inhibitory nucleic acid
methods therefore encompass a number of different
approaches to altering expression of herpesvirus
genes. These different types of inhibitory nucleic
acid technology are described in Helene and Toulme
(1990) Biochim. Biophys. Acta 1049, 99-125, which is
referred to hereinafter as "Helene and Toulme."

In brief, inhibitory nucleic acid therapy approaches
can be classified into those that target DNA
sequences, those that target RNA sequences (including
pre-mRNA and mRNA), those that target proteins (sense
strand approaches), and those that cause cleavage or
chemical modification of the target nucleic acids.



Approaches targeting DNA fall into several categories.
Nucleic acids can be designed to bind to the major
groove of the duplex DNA to form a triple helical or
"triplex" structure. Alternatively, inhibitory
nucleic acids are designed to bind to regions of
single stranded DNA resulting from the opening of the
duplex DNA during replication or transcription.



More commonly, inhibitory nucleic acids are designed
to bind to mRNA or mRNA precursors. Inhibitory
nucleic acids are used to prevent maturation of pre-
mRNA. Inhibitory nucleic acids may be designed to
interfere with RNA processing, splicing or
translation.



The inhibitory nucleic acids can be targeted to mRNA.
In this approach, the inhibitory nucleic acids are
designed to specifically block translation of the
encoded protein. Using this approach, the inhibitory
nucleic acid can be used to selectively suppress
certain cellular functions by inhibition of
translation of mRNA encoding critical proteins. For
example, an inhibitory nucleic acid complementary to
regions of c-myc mRNA inhibits c-myc protein
expression in a human promyelocytic leukemia cell
line, HL60, which overexpresses the c-myc proto-
oncogene. See Wickstrom et al. (1988) PNAS 85, 1028-
1032 and Harel-Bellan et al. (1988) Exp. Med. 168,
2309-2318. As described in Helene and Toulme,
inhibitory nucleic acids targeting mRNA have been
shown to work by several different mechanisms to
inhibit translation of the encoded protein(s).

The inhibitory nucleic acids introduced into the cell
can also encompass the "sense" strand of the gene or
mRNA to trap or compete for the enzymes or binding
proteins involved in mRNA translation, as described in
Helene and Toulme.

Lastly, the inhibitory nucleic acids can be used to
induce chemical inactivation or cleavage of the target
genes or mRNA. Chemical inactivation can occur by the
induction of crosslinks between the inhibitory nucleic
acid and the target nucleic acid within the cell.
Other chemical modifications of the target nucleic
acids induced by appropriately derivatized inhibitory
nucleic acids may also be used.

Cleavage, and therefore inactivation, of the target
nucleic acids may be effected by attaching a
substituent to the inhibitory nucleic acid which can
be activated to induce cleavage reactions. The
substituent can be one that affects either chemical
or enzymatic cleavage. Alternatively, cleavage can
be induced by the use of ribozymes or catalytic RNA.
In this approach, the inhibitory nucleic acids would
comprise either naturally occurring RNA (ribozymes) or
synthetic nucleic acids with catalytic activity.

The targeting of inhibitory nucleic acids to specific
cells of the immune system by conjugation with
targeting moieties binding receptors on the surface of
these cells can be used for all of the above forms of
inhibitory nucleic acid therapy. This invention
encompasses all of the forms of inhibitory nucleic
acid therapy as described above and as described in
Helene and Toulme.

An example of an antiherpes virus inhibitory nucleic
acid is ISIS 2922 (ISIS Pharmaceuticals) which has
activity against CMV (see Biotechnology News 14:5).

A problem associated with inhibitory nucleic acid
therapy is the effective delivery of the inhibitory
nucleic acid to the target cell in vivo and the
subsequent internalization of the inhibitory nucleic
acid by that cell. This can be accomplished by
linking the inhibitory nucleic acid to a targeting
moiety to form a conjugate that binds to a specific
receptor on the surface of the target infected cell,
and which is internalized after binding.



B. Antiviral Agents



The use of combinations of antiviral drugs and
sequential treatments are useful for treatment of
herpesvirus infections and will also be useful for the
treatment of herpesvirus-induced KS. For example,
Snoeck et al. (1992) Eur. J. Clin. Micro. Infect. Dis.
11, 1144-1155, found additive or synergistic effects
against CMV when combining antiherpes drugs (e.g.,
combinations of zidovudine [3'-azido-3'-
deoxythymidine, AZT] with HPMPC, ganciclovir,
foscarnet or acyclovir or of HPMPC with other
antivirals). Similarly, in treatment of
cytomegalovirus retinitis, induction with ganciclovir
followed by maintenance with foscarnet has been
suggested as a way to maximize efficacy while
minimizing the adverse side effects of either
treatment alone. An anti-herpetic composition that
contains acyclovir and, e.g., 2-acetylpyridine-5-((2-
pyridylamino)thiocarbonyl)-thiocarbonohydrazone is
described in U.S. Pat. 5,175,165 (assigned to
Burroughs Wellcome Co.). Combinations of TS-
inhibitors and viral TK-inhibitors in antiherpetic
medicines are disclosed in U.S. Pat. 5,137,724,
assigned to Stichting Rega VZW. A synergistic
inhibitory effect on EBV replication using certain
ratios of combinations of HPMPC with AZT was reported
by Lin et al. (1991) Antimicrob Agents Chemother
35:2440-3.

U.S. Patent Nos. 5,164,395 and 5,021,437 (Blumenkopf;
Burroughs Wellcome) describe the use of a
ribonucleotide reductase inhibitor (an acetylpyridine
derivative) for treatment of herpes infections,
including the use of the acetylpyridine derivative in
combination with acyclovir. U.S. Patent No. 5,137,724
(Balzarini et al. (1990) Mol. Pharm. 37, 402-7) describes
the use of thymidylate synthase inhibitors (e.g., 5-
fluoro-uracil and 5-fluoro-2'-deoxyuridine) in
combination with compounds having viral thymidine
kinase inhibiting activity.

With the discovery of a disease causal agent for KS
now identified, effective therapeutic or prophylactic
protocols to alleviate or prevent the symptoms of
herpesvirus-associated KS can be formulated. Due to
the viral nature of the disease, antiviral agents have
application here for treatment, such as interferons,
nucleoside analogues, ribavirin, amantadine, and
pyrophosphate analogues of phosphonoacetic acid
(foscarnet) (reviewed in Gorbach et al., 1992,
Infectious Disease Ch. 35, 285, W.B. Saunders,
Philadelphia, Pennsylvania) and the like.
Immunological therapy will also be effective in many
cases to manage and alleviate symptoms caused by the
disease agents described here. Antiviral agents
include agents or compositions that directly bind to
viral products and interfere with disease progress;
and exclude agents that do not impact directly on
viral multiplication or viral titer. Antiviral agents
do not include immunoregulatory agents that do not
directly affect viral titer or bind to viral products.
Antiviral agents are effective if they inactivate the
virus, otherwise inhibit its infectivity or
multiplication, or alleviate the symptoms of KS.

The antiherpesvirus agents that will be useful for
treating virus-induced KS can be grouped into broad
classes based on their presumed modes of action.
These classes include agents that act (1) by
inhibition of viral DNA polymerase, (2) by targeting
other viral enzymes and proteins, (3) by miscellaneous
or incompletely understood mechanisms, or (4) by
binding a target nucleic acid (i.e., inhibitory
nucleic acid therapeutics, supra). Antiviral agents
may also be used in combination (i.e., together or
sequentially) to achieve synergistic or additive
effects or other benefits.

Although it is convenient to group antiviral agents by
their supposed mechanism of action, the applicants do
not intend to be bound by any particular mechanism of
antiviral action. Moreover, it will be understood by
those of skill that an agent may act on more than one
target in a virus or virus-infected cell or through
more than one mechanism.

Many antiherpesvirus agents in clinical use or in
development today are nucleoside analogs believed to
act through inhibition of viral DNA replication,
especially through inhibition of viral DNA polymerase.
These nucleoside analogs act as alternative substrates
for the viral DNA polymerase or as competitive
inhibitors of DNA polymerase substrates. Usually
these agents are preferentially phosphorylated by
viral thymidine kinase (TK), if one is present, and/or
have higher affinity for viral DNA polymerase than for
the cellular DNA polymerases, resulting in selective
antiviral activity. Where a nucleoside analogue is
incorporated into the viral DNA, viral activity or
reproduction may be affected in a variety of ways.
For example, the analogue may act as a chain
terminator, cause increased lability (e.g.,
susceptibility to breakage) of analogue-containing
DNA, and/or impair the ability of the substituted DNA
to act as template for transcription or replication
(see, e.g., Balzarini et al., supra).

It will be known to one of skill that, like many
drugs, many of the agents useful for treatment of
herpes virus infections are modified (i.e.,
"activated") by the host, host cell, or virus-infected
host cell metabolic enzymes. For example, acyclovir
is triphosphorylated to its active form, with the
first phosphorylation being carried out by the herpes
virus thymidine kinase, when present. Other examples
are the reported conversion of the compound HOE 602 to
ganciclovir in a three-step metabolic pathway (Winkler
et al., 1990, Antiviral Research 14, 61-74) and the
phosphorylation of ganciclovir to its active form by,
e.g., a CMV nucleotide kinase. It will be apparent to
one of skill that the specific metabolic capabilities
of a virus can affect the sensitivity of that virus to
specific drugs, and are one factor in the choice of an
antiviral drug. The mechanism of action of certain
anti-herpesvirus agents is discussed in De Clercq
(1993, Antimicrobial Chemotherapy 32, Suppl. A, 121-
132) and in other references cited supra and infra.

Anti-herpesvirus medications suitable for treating
viral induced KS include, but are not limited to,
nucleoside analogs including acyclic nucleoside
phosphonate analogs (e.g.,
phosphonylmethoxyalkylpurines and -pyrimidines), and
cyclic nucleoside analogs. These include drugs such as:
vidarabine (9-β-D-arabinofuranosyladenine; adenine
arabinoside, ara-A, Vira-A, Parke-Davis);
1-β-D-arabinofuranosyluracil (ara-U);
1-β-D-arabinofuranosyl-cytosine (ara-C);
HPMPC [(S)-1-[3-hydroxy-2-(phosphonylmethoxy)propyl]cytosine
(e.g., GS 504, Gilead Science)] and its cyclic form
(cHPMPC);
HPMPA [(S)-9-(3-hydroxy-2-phosphonylmethoxypropyl)
adenine] and its cyclic form (cHPMPA);
(S)-HPMPDAP [(S)-9-(3-hydroxy-2-phosphonylmethoxypropyl)-
2,6-diaminopurine];
PMEDAP [9-(2-phosphonyl-methoxyethyl)-2,6-diaminopurine];
HOE 602 [2-amino-9-(1,3-bis(isopropoxy)-2-
propoxymethyl)purine];
PMEA [9-(2-phosphonylmethoxyethyl)adenine];
bromovinyl-deoxyuridine (Burns and Sandford, 1990,
J. Infect. Dis. 162:634-7);
1-β-D-arabinofuranosyl-E-5-(2-bromovinyl)-uridine or
-2'-deoxyuridine;
BVaraU (1-β-D-arabinofuranosyl-E-5-(2-bromovinyl)-uracil,
brovavir, Bristol-Myers Squibb, Yamasa Shoyu);
BVDU [(E)-5-(2-bromovinyl)-2'-deoxyuridine, brivudin,
e.g., Helpin] and its carbocyclic analogue (in which the
sugar moiety is replaced by a cyclopentane ring);
IVDU [(E)-5-(2-iodovinyl)-2'-deoxyuridine] and its
carbocyclic analogue, C-IVDU (Balzarini et al., supra);
and 5-mercurithio analogs of 2'-deoxyuridine
(Holliday and Williams, 1992, Antimicrob. Agents
Chemother. 36, 1935);
acyclovir [9-([2-hydroxyethoxy]methyl)guanine; e.g.,
Zovirax (Burroughs Wellcome)];
penciclovir (9-[4-hydroxy-3-(hydroxymethyl)butyl]-
guanine);
ganciclovir [(9-[1,3-dihydroxy-2-propoxymethyl]-guanine),
e.g., Cymevene, Cytovene (Syntex), DHPG (Stals et al.,
1993, Antimicrobial Agents Chemother. 37, 215-223)];
isopropylether derivatives of ganciclovir (see, e.g.,
Winkelmann et al., 1988, Drug Res. 38, 1545-1548);
cygalovir;
famciclovir [2-amino-9-(4-acetoxy-3-(acetoxymethyl)but-
1-yl)purine (SmithKline Beecham)];
valacyclovir (Burroughs Wellcome);
desciclovir [(2-amino-9-(2-ethoxymethyl)purine) and
2-amino-9-(2-hydroxyethoxymethyl)-9H-purine, prodrugs of
acyclovir];
CDG (carbocyclic 2'-deoxyguanosine);
and purine nucleosides with the pentafuranosyl ring
replaced by a cyclobutane ring (e.g., cyclobut-A
[(+-)-9-[(1β,2α,3β)-2,3-bis(hydroxymethyl)-1-
cyclobutyl]adenine], cyclobut-G [(+-)-9-[(1β,2α,3β)-
2,3-bis(hydroxymethyl)-1-cyclobutyl]guanine], BHCG
[(R)-(1α,2β,3α)-9-(2,3-bis(hydroxymethyl)cyclobutyl)
guanine], and an active isomer of racemic BHCG,
SQ 34,514 [1R-(1α,2β,3α)-2-amino-9-[2,3-
bis(hydroxymethyl)cyclobutyl]-6H-purin-6-one] (see,
Braitman et al., 1991, Antimicrob. Agents and
Chemotherapy 35, 1464-1468)). Certain of these
antiherpesviral agents are discussed in Gorbach et al.,
1992, Infectious Disease Ch. 35, 269, W.B. Saunders,
Philadelphia; Saunders et al., 1990, J. Acquir. Immune
Defic. Syndr. 3, 571; Yamanaka et al., 1991, Mol.
Pharmacol. 40, 446; and Greenspan et al., 1991, J.
Acquir. Immune Defic. Syndr. 3, 371.

Triciribine and triciribine monophosphate are potent
inhibitors against herpes viruses (Ickes et al.,
1994, Antiviral Research 23, Seventh International
Conf. on Antiviral Research, Abstract No. 122, Supp.;
[author and year illegible], AIDS Res. Human
Retroviruses 9, 307-314) and are nucleoside analogs
that may be used to treat KS. An exemplary protocol
for these agents is an intravenous injection of about
0.35 mg/meter² (0.7 mg/kg) once weekly or every other
week for at least two doses, preferably up to about
four to eight weeks.

Acyclovir and ganciclovir are of interest because of
their accepted use in clinical settings. Acyclovir,
an acyclic analogue of guanine, is phosphorylated by
a herpesvirus thymidine kinase and undergoes further
phosphorylation to be incorporated as a chain
terminator by the viral DNA polymerase during viral
replication. It has therapeutic activity against a
broad range of herpesviruses, Herpes simplex Types 1
and 2, Varicella-Zoster, Cytomegalovirus, and
Epstein-Barr Virus, and is used to treat disease such
as herpes encephalitis, neonatal herpesvirus
infections, chickenpox in immunocompromised hosts,
herpes zoster recurrences, CMV retinitis, EBV
infections, chronic fatigue syndrome, and hairy
leukoplakia in AIDS patients. Exemplary intravenous
dosages or oral dosages are 250 mg/kg/m² body surface
area every 6 hours for 7 days, or maintenance doses
of 200-400 mg IV or orally twice a day to suppress
recurrence. Ganciclovir has been shown to be more
active than acyclovir against some herpesviruses. See,
e.g., Oren and Soble, 1991, Clinical Infectious
Diseases 14, 741-6. Treatment protocols for
ganciclovir are 5 mg/kg twice a day IV or 2.5 mg/kg
three times a day for 10-14 days. Maintenance doses
are 5-6 mg/kg for 5-7 days.



Also of interest is HPMPC. HPMPC is reported to be
more active than either acyclovir or ganciclovir in
the chemotherapy and prophylaxis of various HSV-1,
HSV-2, TK- HSV, VZV or CMV infections in animal models
(De Clercq, supra).

Nucleoside analogs such as BVaraU are potent
inhibitors of HSV-1, EBV, and VZV that have greater
activity than acyclovir in animal models of
encephalitis. FIAC (fluoroiodoarabinosyl cytosine) and
its related fluoroethyl and iodo compounds (e.g., FEAU,
FIAU) have potent selective activity against
herpesviruses, and HPMPA ((S)-1-([3-hydroxy-2-
phosphorylmethoxy]propyl)adenine) has been
demonstrated to be more potent against HSV and CMV
than acyclovir or ganciclovir; these agents are of
choice in advanced cases of KS. Cladribine (2-
chlorodeoxyadenosine) is another nucleoside analogue
known as a highly specific antilymphocyte agent (i.e.,
an immunosuppressive drug).

Other useful antiviral agents include: 5-thien-2-yl-
2'-deoxyuridine derivatives, e.g., BTDU [5-(5-
bromothien-2-yl)-2'-deoxyuridine] and CTDU [5-(5-
chlorothien-2-yl)-2'-deoxyuridine]; and OXT-A [9-(2-
deoxy-2-hydroxymethyl-β-D-erythro-oxetanosyl)adenine]
and OXT-G [9-(2-deoxy-2-hydroxymethyl-β-D-erythro-
oxetanosyl)guanine]. Although OXT-G is believed to
act by inhibiting viral DNA synthesis, its mechanism of
action has not yet been elucidated. These and other
compounds are described in Andrei et al., 1992, Eur.
J. Clin. Microbiol. Infect. Dis. 11, 142-51.
Additional antiviral purine derivatives useful in
treating herpesvirus infections are disclosed in US
Pat. 5,108,994 (assigned to Beecham Group p.l.c.).
6-Methoxypurine arabinoside (ara-M; Burroughs Wellcome)
is a potent inhibitor of varicella-zoster virus, and
will be useful for treatment of KS.

Certain thymidine analogs [e.g., idoxuridine (5-iodo-
2'-deoxyuridine) and trifluorothymidine] have
antiherpes viral activity, but due to their systemic
toxicity, are largely used for topical herpesviral
infections, including HSV stromal keratitis and
uveitis, and are not preferred here unless other
options are ruled out.

Other useful antiviral agents that have demonstrated
antiherpes viral activity include foscarnet sodium
(trisodium phosphonoformate, PFA, Foscavir (Astra))
and phosphonoacetic acid (PAA). Foscarnet is an
inorganic pyrophosphate analogue that acts by
competitively blocking the pyrophosphate-binding site
of DNA polymerase. These agents block DNA
polymerase directly without processing by viral
thymidine kinase. Foscarnet is reported to be less
toxic than PAA.

ii) Other Antivirals

Although applicants do not intend to be bound by a
particular mechanism of antiviral action, the
antiherpes-virus agents described above are believed
to act through inhibition of viral DNA polymerase.
However, viral replication requires not only the
replication of the viral nucleic acid but also the
production of viral proteins and other essential
components. Accordingly, the present invention
contemplates treatment of KS by the inhibition of
viral proliferation by targeting viral proteins other
than DNA polymerase (e.g., by inhibition of their
synthesis or activity, or destruction of viral
proteins after their synthesis). For example,
administration of agents that inhibit a viral serine
protease, e.g., one important in development
of the viral capsid, will be useful in treatment of
viral induced KS.



Other viral enzyme targets include: OMP decarboxylase
(a target of, e.g., pyrazofurin), CTP synthetase (a
target of, e.g., cyclopentenylcytosine), IMP
dehydrogenase, ribonucleotide reductase (a target of,
e.g., carboxyl-containing N-alkyldipeptides as
described in U.S. Patent No. 5,110,795 (Tolman et al.,
Merck)), thymidine kinase (a target of, e.g., 1-[2-
(hydroxymethyl)cycloalkylmethyl]-5-substituted
-uracils and -guanines as described in, e.g., U.S.
Patent Nos. 4,863,927 and 4,732,062 (Tolman et al.,
Merck)) as well as other enzymes. It will be apparent
to one of ordinary skill in the art that there are
additional viral proteins, both characterized and as
yet to be discovered, that can serve as targets for
antiviral agents.

Kutapressin is a liver derivative available from
Schwarz Pharma of Milwaukee, Wisconsin in an injectable
form of 25 mg/ml. The recommended dosage for
herpesviruses is from 200 to 25 mg/ml per day for an
average adult of 150 pounds.

Poly(I)·Poly(C12U), an accepted antiviral drug known as
Ampligen from HEM Pharmaceuticals of Rockville, MD, has
been shown to inhibit herpesviruses and is another
antiviral agent suitable for treating KS. Intravenous
injection is the preferred route of administration.
Dosages from about 100 to 600 mg/m² are administered
two to three times weekly to adults averaging 150
pounds. It is best to administer at least 200 mg/m²
per week.

Other antiviral agents reported to show activity
against herpes viruses (e.g., varicella zoster and
herpes simplex) and that will be useful for the
treatment of herpesvirus-induced KS include mappicine
ketone (SmithKline Beecham); Compounds A,79296 and
A,72209 (Abbott) for varicella zoster; and Compound
882C87 (Burroughs Wellcome) (see, The Pink Sheet 35(20)
May 17, 1993).

Interferon is known to inhibit replication of herpes
viruses. See Oren and Soble, supra. Interferon has
known toxicity problems and it is expected that second
generation derivatives will soon be available that
will retain interferon's antiviral properties but have
reduced side effects.



It is also contemplated that herpesvirus-induced KS
may be treated by administering a herpesvirus
reactivating agent to induce reactivation of the
latent virus. Preferably the reactivation is combined
with simultaneous or sequential administration of an
anti-herpesvirus agent. Controlled reactivation over
a short period of time or reactivation in the presence
of an antiviral agent is believed to minimize the
adverse effects of certain herpesvirus infections
(e.g., as discussed in PCT Application WO 93/04553).
Reactivating agents include agents such as estrogen,
phorbol esters, forskolin and β-adrenergic blocking
agents.

Agents useful for treatment of herpesvirus infections
and for treatment of herpesvirus-induced KS are
described in numerous U.S. Patents. For example,
ganciclovir is an example of an antiviral guanine
acyclic nucleotide of the type described in US Patent
Nos. 4,355,032 and 4,603,219.



Acyclovir is an example of a class of antiviral purine
derivatives, including 9-(2-hydroxyethylmethyl)adenine,
of the type described in U.S. Pat. Nos. 4,287,188,
4,294,831 and 4,199,574.



Brivudin is an example of an antiviral deoxyuridine
derivative of the type described in US Patent No.
4,424,211.



Vidarabine is an example of an antiviral purine
nucleoside of the type described in British Pat.
1,159,290.

Brovavir is an example of an antiviral deoxyuridine
derivative of the type described in US Patent Nos.
4,542,210 and 4,386,076.

BHCG is an example of an antiviral carbocyclic
nucleoside analogue of the type described in US Patent
Nos. 5,153,352, 5,034,394 and 5,126,345.

HPMPC is an example of an antiviral
phosphonylmethoxyalkyl derivative of the type
described in US Patent No. 5,142,051.

CDG (Carbocyclic 2'-deoxyguanosine) is an example of
an antiviral carbocyclic nucleoside analogue of the
type described in US Patent Nos. 4,543,255, 4,855,465,
and 4,694,455.



Foscarnet is described in US Patent No. 4,339,445.

Trifluridine and its corresponding ribonucleoside are
described in US Patent No. 3,201,387.

U.S. Patent No. 5,321,030 (Kaddurah-Daouk et al.;
Amira) describes the use of creatine analogs as
antiherpes viral agents. U.S. Patent No. 5,306,722
(Kim et al.; Bristol-Myers Squibb) describes
thymidine kinase inhibitors useful for treating HSV
infections and for inhibiting herpes thymidine kinase.
Other antiherpesvirus compositions are described in
U.S. Patent Nos. 5,286,645 and 5,098,70[?] (Konishi et
al., Bristol-Myers Squibb) and 5,175,165 (Blumenkopf
et al.; Burroughs Wellcome). U.S. Patent No.
4,860,520 (Ashton et al., Merck) describes the
antiherpes virus agent (S)-9-(2,3-dihydroxy-1-
propoxymethyl)guanine.

U.S. Patent No. 4,708,935 (Suhadolnik et al., Research
Corporation) describes a 3'-deoxyadenosine compound
effective in inhibiting HSV and EBV. U.S. Patent No.
4,386,076 (Machida et al., Yamasa Shoyu Kabushiki
Kaisha) describes use of
(E)-5-(2-halogenovinyl)-arabinofuranosyluracil as an
antiherpesvirus agent. U.S. Patent No. 4,340,599
(Lieb et al., Bayer Aktiengesellschaft) describes
phosphonohydroxyacetic acid derivatives useful as
antiherpes agents. U.S. Patent Nos. 4,053,715 and
4,053,716 (Lin et al., Research Corporation) describe
5'-amino-5'-deoxythymidine and 5-iodo-5'-
amino-2',5'-dideoxycytidine as potent inhibitors of
herpes simplex virus. U.S. Patent No. 4,065,352
(Baker et al., Parke, Davis & Company) describes
9-(5-O-Acyl-beta-D-arabinofuranosyl)adenine compounds
useful as antiviral agents. U.S. Patent No. 3,527,216
(Witkowski et al.) describes the use of
1,2,4-triazole-3-carboxamide and
1,2,4-triazole-3-thiocarboxamide for inhibiting herpes
virus infections. U.S. Patent No. 5,175,053 (Afonso et
al., Schering) describes quinoline-2,4-dione
derivatives active against herpes simplex virus 1 and
2, cytomegalovirus and Epstein Barr virus.

iii) Administration

The subjects to be treated or whose tissue may be used
herein may be a mammal, or more specifically a human,
horse, pig, rabbit, dog, monkey, or rodent. In the
preferred embodiment the subject is a human.



The compositions are administered in a manner
compatible with the dosage formulation, and in a
therapeutically effective amount. Precise amounts of
active ingredient required to be administered depend
on the judgment of the practitioner and are peculiar
to each subject.

Suitable regimes for initial administration and
booster shots are also variable, but are typified by
an initial administration followed by repeated doses
at one or more hour intervals by a subsequent
injection or other administration.

As used herein administration means a method of
administering to a subject. Such methods are well
known to those skilled in the art and include, but are
not limited to, administration topically,
parenterally, orally, intravenously, intramuscularly,
subcutaneously or by aerosol. Administration of the
agent may be effected continuously or intermittently
such that the therapeutic agent in the patient is
effective to treat a subject with Kaposi's sarcoma or
a subject infected with a DNA virus associated with
Kaposi's sarcoma.



The antiviral compositions for treating herpesvirus-
induced KS are preferably administered to human
patients via oral, intravenous or parenteral
administrations and other systemic forms. Those of
skill in the art will understand appropriate
administration protocol for the individual
compositions to be employed by the physician.



The pharmaceutical formulations or compositions of
this invention may be in the dosage form of solid,
semi-solid, or liquid such as, e.g., suspensions,
aerosols or the like. Preferably the compositions are
administered in unit dosage forms suitable for single
administration of precise dosage amounts. The
compositions may also include, depending on the
formulation desired, pharmaceutically-acceptable, non-
toxic carriers or diluents, which are defined as
vehicles commonly used to formulate pharmaceutical
compositions for animal or human administration. The
diluent is selected so as not to affect the biological
activity of the combination. Examples of such
diluents are distilled water, physiological saline,
Ringer's solution, dextrose solution, and Hank's
solution. In addition, the pharmaceutical composition
or formulation may also include other carriers,
adjuvants, or nontoxic, nontherapeutic, nonimmunogenic
stabilizers and the like. Effective amounts of such
diluent or carrier are those amounts which are
effective to obtain a pharmaceutically acceptable
formulation in terms of solubility of components, or
biological activity, etc.

Immunological Approaches to Therapy



Having identified a primary causal agent of KS in
humans as a novel human herpesvirus, there are
immunosuppressive therapies that can modulate the
immunologic dysfunction that arises from the presence
of viral-infected tissue. In particular, agents that
block the immunological attack of the viral-infected
cells will ameliorate the symptoms of KS and/or reduce
disease progression. Such therapies include
antibodies that prevent immune system targeting of
viral-infected cells. Such agents include antibodies
which bind to cytokines that otherwise upregulate the
immune system in response to viral infection.

The antibody may be administered to a patient either
singly or in a cocktail containing two or more
antibodies, other therapeutic agents, compositions, or
the like, including, but not limited to,
immunosuppressive agents, potentiators and side-effect
relieving agents. Of particular interest are
immunosuppressive agents useful in suppressing allergic
reactions of a host. Immunosuppressive agents of
interest include prednisone, prednisolone, DECADRON
(Merck, Sharp & Dohme, West Point, PA),
cyclophosphamide, cyclosporine, 6-mercaptopurine,
methotrexate, azathioprine and i.v. gamma globulin or
their combination. Potentiators of interest include
monensin, ammonium chloride and chloroquine. All of
these agents are administered in generally accepted
efficacious dose ranges such as those disclosed in the
Physician's Desk Reference, 41st Ed. (1987), Publisher
Edward R. Barnhart, New Jersey.

Immune globulin from persons previously infected with
human herpesviruses or related viruses can be obtained
using standard techniques. Appropriate titers of
antibodies are known for this therapy and are readily
applied to the treatment of KS. Immune globulin can
be administered via parenteral injection or by
intrathecal shunt. In brief, immune globulin
preparations may be obtained from individual donors
who are screened for antibodies to the KS-associated
human herpesvirus, and plasmas from high-titered
donors are pooled. Alternatively, plasmas from donors
are pooled and then tested for antibodies to the human
herpesvirus of the invention; high-titered pools are
then selected for use in KS patients.

Antibodies may be formulated into an injectable
preparation. Parenteral formulations are known and
are suitable for use in the invention, preferably for
i.m. or i.v. administration. The formulations
containing therapeutically effective amounts of
antibodies or immunotoxins are either sterile liquid
solutions, liquid suspensions or lyophilized versions
and optionally contain stabilizers or excipients.
Lyophilized compositions are reconstituted with
suitable diluents, e.g., water for injection, saline,
0.3% glycine and the like, at a level of from about
0.01 mg/kg of host body weight to 10 mg/kg where
appropriate. Typically, the pharmaceutical
compositions containing the antibodies or immunotoxins
will be administered in a therapeutically effective
dose in a range of from about 0.01 mg/kg to about 5
mg/kg of the treated mammal. A preferred
therapeutically effective dose of the pharmaceutical
composition containing antibody or immunotoxin will be
in a range of from about 0.01 mg/kg to about 0.5 mg/kg
body weight of the treated mammal administered over
several days to two weeks by daily intravenous
infusion, each given over a one hour period, in a
sequential patient dose-escalation regimen.

Antibody may be administered systemically by injection
i.m., subcutaneously or intraperitoneally or directly
into KS lesions. The dose will be dependent upon the
properties of the antibody or immunotoxin employed,
e.g., its activity and biological half-life, the
concentration of antibody in the formulation, the site
and rate of dosage, the clinical tolerance of the
patient involved, the disease afflicting the patient
and the like as is well within the skill of the
physician.

The antibody of the present invention may be
administered in solution. The pH of the solution
should be in the range of pH 5 to 9.5, preferably pH
6.5 to 7.5. The antibody or derivatives thereof
should be in a solution having a suitable
pharmaceutically acceptable buffer such as phosphate,
tris(hydroxymethyl)aminomethane-HCl or citrate and
the like. Buffer concentrations should be in the
range of 1 to 100 mM. The solution of antibody may
also contain a salt, such as sodium chloride or
potassium chloride in a concentration of 50 to 150 mM.
An effective amount of a stabilizing agent such as an
albumin, a globulin, a gelatin, a protamine or a salt
of protamine may also be included and may be added to
a solution containing antibody or immunotoxin or to
the composition from which the solution is prepared.
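
The solution parameters stated above (pH, buffer
concentration and optional salt concentration) can be
expressed as a simple range check. The following is a
minimal illustrative sketch only. The numeric ranges are
taken from the preceding paragraph; the class and
function names are hypothetical and form no part of the
disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Numeric ranges quoted in the text; every name here is hypothetical.
PH_RANGE: Tuple[float, float] = (5.0, 9.5)
PREFERRED_PH_RANGE: Tuple[float, float] = (6.5, 7.5)
BUFFER_MM_RANGE: Tuple[float, float] = (1.0, 100.0)
SALT_MM_RANGE: Tuple[float, float] = (50.0, 150.0)


@dataclass(frozen=True)
class AntibodySolution:
    ph: float
    buffer_mm: float
    salt_mm: Optional[float] = None  # the text makes salt optional


def _within(value: float, bounds: Tuple[float, float]) -> bool:
    low, high = bounds
    return low <= value <= high


def check_solution(solution: AntibodySolution) -> List[str]:
    """Return a list of out-of-range findings (empty if none)."""
    problems: List[str] = []
    if not _within(solution.ph, PH_RANGE):
        problems.append("pH outside 5 to 9.5")
    if not _within(solution.buffer_mm, BUFFER_MM_RANGE):
        problems.append("buffer outside 1 to 100 mM")
    if solution.salt_mm is not None and not _within(
        solution.salt_mm, SALT_MM_RANGE
    ):
        problems.append("salt outside 50 to 150 mM")
    return problems


def has_preferred_ph(solution: AntibodySolution) -> bool:
    """True when the pH falls in the preferred 6.5 to 7.5 window."""
    return _within(solution.ph, PREFERRED_PH_RANGE)
```

Under these assumptions, a solution at pH 7.0 with 20 mM
buffer and 150 mM salt yields no findings and falls in
the preferred pH window.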



Systemic administration of antibody is made daily,
generally by intramuscular injection, although
intravascular infusion is acceptable. Administration
may also be intranasal or by other nonparenteral
routes. Antibody or immunotoxin may also be
administered via microspheres, liposomes or other
microparticulate delivery systems placed in certain
tissues including blood.

In therapeutic applications, the dosages of compounds
used in accordance with the invention vary depending
on the class of compound and the condition being
treated. The age, weight, and clinical condition of
the recipient patient, and the experience and judgment
of the clinician or practitioner administering the
therapy are among the factors affecting the selected
dosage. For example, the dosage of an immunoglobulin
can range from about 0.1 milligram per kilogram of
body weight per day to about 10 mg/kg per day for
polyclonal antibodies and about 5% to about 20% of
that amount for monoclonal antibodies. In such a
case, the immunoglobulin can be administered once
daily as an intravenous infusion. Preferably, the
dosage is repeated daily until either a therapeutic
result is achieved or until side effects warrant
discontinuation of therapy. Generally, the dose
should be sufficient to treat or ameliorate symptoms
or signs of KS without producing unacceptable toxicity
to the patient.
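
The arithmetic behind the immunoglobulin dosage range
stated above can be sketched as follows. This is an
illustration of the calculation only, not a dosing tool.
The function and class names are hypothetical, and the
reading of "about 5% to about 20% of that amount" as 5%
of the lower polyclonal bound up to 20% of the upper
polyclonal bound is an assumption made for the example.

```python
from dataclasses import dataclass

# Figures quoted in the text (mg per kg body weight per day).
POLYCLONAL_MG_PER_KG_PER_DAY = (0.1, 10.0)
# Monoclonal dose as a fraction of the polyclonal amount.
MONOCLONAL_FRACTION = (0.05, 0.20)


@dataclass(frozen=True)
class DoseRange:
    low_mg: float
    high_mg: float


def polyclonal_daily_range(weight_kg: float) -> DoseRange:
    """Daily polyclonal range in mg for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low, high = POLYCLONAL_MG_PER_KG_PER_DAY
    return DoseRange(low * weight_kg, high * weight_kg)


def monoclonal_daily_range(weight_kg: float) -> DoseRange:
    """Daily monoclonal range in mg.

    Assumption: the widest reading of the text, i.e. 5% of the
    lower polyclonal bound to 20% of the upper polyclonal bound.
    """
    poly = polyclonal_daily_range(weight_kg)
    low_frac, high_frac = MONOCLONAL_FRACTION
    return DoseRange(poly.low_mg * low_frac, poly.high_mg * high_frac)
```

For a hypothetical 70 kg subject this gives a polyclonal
range of about 7 mg to 700 mg per day, and a monoclonal
range of about 0.35 mg to 140 mg per day under the
stated assumption.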

An effective amount of the compound is that which
provides either subjective relief of a symptom(s) or
an objectively identifiable improvement as noted by
the clinician or other qualified observer. The dosing
range varies with the compound used, the route of
administration and the potency of the particular
compound.

VI. Vaccines and Prophylaxis for KS

This invention provides substances suitable for use as
vaccines for the prevention of KS and methods for
administering them. The vaccines are directed against
KSHV and most preferably comprise antigens obtained
from KSHV. In one embodiment, the vaccine contains
attenuated KSHV. In another embodiment, the vaccine
contains killed KSHV. In another embodiment, the
vaccine contains a nucleic acid vector encoding a KSHV
polypeptide. In another embodiment, the vaccine is a
subunit vaccine containing a KSHV polypeptide.



This invention provides a recombinant KSHV virus with
a gene encoding a KSHV polypeptide deleted from the
genome. The recombinant virus is useful as an
attenuated vaccine to prevent KSHV infection.

This invention provides a method of vaccinating a
subject against Kaposi's sarcoma, comprising
administering to the subject an effective amount of
the peptide or polypeptide encoded by the isolated DNA
molecule, and a suitable acceptable carrier, thereby
vaccinating the subject. In one embodiment naked DNA
is administered to the subject in an effective amount
to vaccinate the subject against Kaposi's sarcoma.

This invention provides a method of immunizing a
subject against disease caused by KSHV which comprises
administering to the subject an effective immunizing
dose of an isolated herpesvirus subunit vaccine.

A. Vaccines 

The vaccine can be made using synthetic peptide or
recombinantly-produced polypeptide described above as
antigen. Typically, a vaccine will include from about
1 to 50 micrograms of antigen. More preferably, the
amount of polypeptide is from about 15 to about 45
micrograms. Typically, the vaccine is formulated so
that a dose includes about 0.5 milliliters. The
vaccine may be administered by any route known in the
art. Preferably, the route is parenteral. More
preferably, it is subcutaneous or intramuscular.
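
The antigen amounts and dose volume stated above imply
an antigen concentration for the formulated vaccine.
The short sketch below works through that arithmetic
for illustration. Only the numeric figures come from
the text; the function names are hypothetical.

```python
# Figures quoted in the text; names are hypothetical.
ANTIGEN_UG_RANGE = (1.0, 50.0)             # micrograms per dose
PREFERRED_ANTIGEN_UG_RANGE = (15.0, 45.0)  # micrograms per dose
DOSE_VOLUME_ML = 0.5                       # milliliters per dose


def antigen_concentration_ug_per_ml(
    antigen_ug: float, dose_volume_ml: float = DOSE_VOLUME_ML
) -> float:
    """Concentration needed so one dose carries antigen_ug."""
    low, high = ANTIGEN_UG_RANGE
    if not low <= antigen_ug <= high:
        raise ValueError("antigen amount outside 1 to 50 micrograms")
    if dose_volume_ml <= 0:
        raise ValueError("dose volume must be positive")
    return antigen_ug / dose_volume_ml


def is_preferred_amount(antigen_ug: float) -> bool:
    """True when the amount is in the preferred 15 to 45 microgram range."""
    low, high = PREFERRED_ANTIGEN_UG_RANGE
    return low <= antigen_ug <= high
```

With a 0.5 milliliter dose, the preferred 15 to 45
microgram range corresponds to 30 to 90 micrograms of
antigen per milliliter.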

There are a number of strategies for amplifying an
antigen's effectiveness, particularly as related to
the art of vaccines. For example, cyclization or
circularization of a peptide can increase the
peptide's antigenic and immunogenic potency. See U.S.
Pat. No. 5,001,049. More conventionally, an antigen
can be conjugated to a suitable carrier, usually a
protein molecule. This procedure has several facets.
It can allow multiple copies of an antigen, such as a
peptide, to be conjugated to a single larger carrier
molecule. Additionally, the carrier may possess
properties which facilitate transport, binding,
absorption or transfer of the antigen.

For parenteral administration, such as subcutaneous
injection, examples of suitable carriers are the
tetanus toxoid, the diphtheria toxoid, serum albumin
and lamprey, or keyhole limpet, hemocyanin because
they provide the resultant conjugate with minimum
genetic restriction. Conjugates including these
universal carriers can function as T cell clone
activators in individuals having very different gene
sets.

The conjugation between a peptide and a carrier can be
accomplished using one of the methods known in the
art. Specifically, the conjugation can use
bifunctional cross-linkers as binding agents as
detailed, for example, by Means and Feeney, "A recent
review of protein modification techniques,"
Bioconjugate Chem. 1, 2-12 (1990).

Vaccines against a number of the herpesviruses have
been successfully developed. Vaccines against
Varicella-Zoster Virus using a live attenuated Oka
strain are effective in preventing herpes zoster in the
elderly, and in preventing chickenpox in both
immunocompromised and normal children (Hardy, I. et
al., 1990, Inf. Dis. Clin. N. Amer. 4, 159; Hardy, I.
et al., 1991, New Engl. J. Med. 325, 1545; Levin, M.J.
et al., 1992, J. Inf. Dis. 166, 253; Gershon, A.A.,
1992, J. Inf. Dis. 166(Suppl), 563). Vaccines against
Herpes simplex Types 1 and 2 are also commercially
available with some success in protection against
primary disease, but have been less successful in
preventing the establishment of latent infection in
sensory ganglia (Roizman, B., 1991, Rev. Inf. Disease
13(Suppl 11), S892; Skinner, G.R. et al., 1992, Med.
Microbiol. Immunol. 180, 305).

Vaccines against KSHV can be made from the KSHV
envelope glycoproteins. These polypeptides can be
purified and used for vaccination (Lasky, L.A., 1990,
J. Med. Virol. 31, 59). MHC-binding peptides from
cells infected with the human herpesvirus can be
identified for vaccine candidates per the methodology
of Marloes, et al., 1991, Eur. J. Immunol. 21,
2963-2970.

The KSHV antigen may be combined or mixed with various
solutions and other compounds as is known in the art.
For example, it may be administered in water, saline
or buffered vehicles with or without various adjuvants
or immunodiluting agents. Examples of such adjuvants
or agents include aluminum hydroxide, aluminum
phosphate, aluminum potassium sulfate (alum),
beryllium sulfate, silica, kaolin, carbon,
water-in-oil emulsions, oil-in-water emulsions,
muramyl dipeptide, bacterial endotoxin, lipid X,
Corynebacterium parvum (Propionibacterium acnes),
Bordetella pertussis, polyribonucleotides, sodium
alginate, lanolin, lysolecithin, vitamin A, saponin,
liposomes, levamisole, DEAE-dextran, blocked
copolymers or other synthetic adjuvants. Such
adjuvants are available commercially from various
sources, for example, Merck Adjuvant 65 (Merck and
Company, Inc., Rahway, N.J.) or Freund's Incomplete
Adjuvant and Complete Adjuvant (Difco Laboratories,
Detroit, Michigan). Other suitable adjuvants are
Amphigen (oil-in-water), Alhydrogel (aluminum
hydroxide), or a mixture of Amphigen and Alhydrogel.
Only aluminum is approved for human use.

The proportion of antigen and adjuvant can be varied
over a broad range so long as both are present in
effective amounts. For example, aluminum hydroxide
can be present in an amount of about 0.5% of the
vaccine mixture (Al2O3 basis). On a per-dose basis,
the amount of the antigen can range from about 0.1
to about 100 µg protein per patient. A preferable
range is from about 1 µg to about 50 µg per dose. A
more preferred range is about 15 µg to about 45 µg.
A suitable dose size is about 0.5 ml. Accordingly, a
dose for intramuscular injection, for example, would
comprise 0.5 ml containing 45 µg of antigen in
admixture with 0.5% aluminum hydroxide. After
formulation, the vaccine may be incorporated into a
sterile container which is then sealed and stored at
a low temperature, for example 4°C, or it may be
freeze-dried. Lyophilization permits long-term
storage in a stabilized form.
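
The per-dose figures above can be checked with a few
lines of arithmetic. The following is a minimal sketch
only; the function names and the range labels are
illustrative assumptions and are not part of the
disclosure. It converts an antigen amount and dose
volume into a concentration and reports which of the
stated ranges (0.1-100 µg, 1-50 µg, 15-45 µg) a given
antigen amount falls within.

```python
# Per-dose antigen ranges stated in the text, in micrograms.
# The labels are illustrative names chosen for this sketch.
DOSE_RANGES_UG = {
    "broad": (0.1, 100.0),
    "preferable": (1.0, 50.0),
    "more_preferred": (15.0, 45.0),
}


def antigen_concentration_ug_per_ml(antigen_ug: float, dose_ml: float) -> float:
    """Antigen concentration of a single dose in micrograms per milliliter."""
    if dose_ml <= 0:
        raise ValueError("dose volume must be positive")
    return antigen_ug / dose_ml


def matching_ranges(antigen_ug: float) -> list:
    """Names of the stated ranges that contain the given antigen amount."""
    return [
        name
        for name, (low, high) in DOSE_RANGES_UG.items()
        if low <= antigen_ug <= high
    ]


if __name__ == "__main__":
    # The worked example in the text: 45 ug of antigen in a 0.5 ml dose.
    print(antigen_concentration_ug_per_ml(45.0, 0.5))
    print(matching_ranges(45.0))
```

For the example dose in the text (45 µg in 0.5 ml) the
concentration works out to 90 µg/ml, which sits at the
upper bound of the more preferred range.
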

The vaccines may be administered by any conventional
method for the administration of vaccines including
oral and parenteral (e.g., subcutaneous or
intramuscular) injection. Intramuscular administration
is preferred. The treatment may consist of a single
dose of vaccine or a plurality of doses over a period
of time. It is preferred that the dose be given to a
human patient within the first 8 months of life. The
antigen of the invention can be combined with
appropriate doses of compounds including influenza
antigens, such as influenza type A antigens. Also,
the antigen could be a component of a recombinant
vaccine which could be adaptable for oral
administration.

Vaccines of the invention may be combined with other
vaccines for other diseases to produce multivalent
vaccines. A pharmaceutically effective amount of the
antigen can be employed with a pharmaceutically
acceptable carrier such as a protein or diluent useful
for the vaccination of mammals, particularly humans.
Other vaccines may be prepared according to methods
well-known to those skilled in the art.



Those of skill will readily recognize that it is only
necessary to expose a mammal to appropriate epitopes
in order to elicit effective immunoprotection. The
epitopes are typically segments of amino acids which
are a small portion of the whole protein. Using
recombinant genetics, it is routine to alter a natural
protein's primary structure to create derivatives
embracing epitopes that are identical to or
substantially the same as (immunologically equivalent
to) the naturally occurring epitopes. Such
derivatives may include peptide fragments, amino acid
substitutions, amino acid deletions and amino acid
additions of the amino acid sequence for the viral
polypeptides from the human herpesvirus. For example,
it is known in the protein art that certain amino acid
residues can be substituted with amino acids of
similar size and polarity without an undue effect upon
the biological activity of the protein. The human
herpesvirus polypeptides have significant tertiary
structure and the epitopes are usually conformational.
Thus, modifications should generally preserve
conformation to produce a protective immune response.

B. Antibody Prophylaxis

Therapeutic, intravenous, polyclonal or monoclonal
antibodies can be used as a mode of passive
immunotherapy of herpesviral diseases including
perinatal varicella and CMV. Immune globulin from
persons previously infected with the human herpesvirus
and bearing a suitably high titer of antibodies
against the virus can be given in combination with
antiviral agents (e.g. ganciclovir), or in combination
with other modes of immunotherapy that are currently
being evaluated for the treatment of KS, which are
targeted to modulating the immune response (i.e.
treatment with copolymer-1, anti-idiotypic monoclonal
antibodies, T cell "vaccination"). Antibodies to
human herpesvirus can be administered to the patient
as described herein. Antibodies specific for an
epitope expressed on cells infected with the human
herpesvirus are preferred and can be obtained as
described above.

A polypeptide, analog or active fragment can be
formulated into the therapeutic composition as
neutralized pharmaceutically acceptable salt forms.
Pharmaceutically acceptable salts include the acid
addition salts (formed with the free amino groups of
the polypeptide or antibody molecule) and which are
formed with inorganic acids such as, for example,
hydrochloric or phosphoric acids, or such organic
acids as acetic, oxalic, tartaric, mandelic, and the
like. Salts formed from the free carboxyl groups can
also be derived from inorganic bases such as, for
example, sodium, potassium, ammonium, calcium, or
ferric hydroxides, and such organic bases as
isopropylamine, trimethylamine, 2-ethylamino ethanol,
histidine, procaine, and the like.



C. Monitoring Therapeutic Efficacy

This invention provides a method for monitoring the
therapeutic efficacy of treatment for Kaposi's sarcoma
which comprises: (a) determining in a first sample
from a subject with Kaposi's sarcoma the presence of
the isolated nucleic acid molecule; (b) administering
to the subject a therapeutic amount of an agent such
that the agent is contacted to the cell in a sample;
(c) determining after a suitable period of time the
amount of the isolated nucleic acid molecule in the
second sample from the treated subject; and (d)
comparing the amount of isolated nucleic acid molecule
determined in the first sample with the amount
determined in the second sample, a difference
indicating the effectiveness of the agent, thereby
monitoring the therapeutic efficacy of treatment for
Kaposi's sarcoma. As defined herein "amount" is viral
load or copy number. Methods of determining viral
load or copy number are known to those skilled in the
art.
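
Step (d) of the monitoring method, the comparison of
the first and second samples, can be sketched as
follows. This is an illustration under stated
assumptions and not a prescribed procedure: the
function names are hypothetical, and the one-log
threshold for calling an agent effective is an
arbitrary placeholder, since the text requires only
"a difference" between the two amounts.

```python
import math


def log10_change(first_copies: float, second_copies: float) -> float:
    """log10 of (second sample / first sample); negative means the load fell."""
    if first_copies <= 0 or second_copies <= 0:
        raise ValueError("copy numbers must be positive")
    return math.log10(second_copies / first_copies)


def agent_appears_effective(
    first_copies: float,
    second_copies: float,
    min_log_drop: float = 1.0,  # assumed threshold, not taken from the text
) -> bool:
    """True when the viral load fell by at least min_log_drop log10 units."""
    return log10_change(first_copies, second_copies) <= -min_log_drop


if __name__ == "__main__":
    # Hypothetical copy numbers before and after treatment.
    print(log10_change(1e5, 1e3))
    print(agent_appears_effective(1e5, 1e3))
```

Expressing the difference on a log scale is a common
convention for viral load data; any other measure of
difference between the two samples would fit the method
equally well.
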

VII. Screening Assays For Pharmaceuticals for
Alleviating the Symptoms of KS



Since an agent involved in the causation or
progression of KS has been identified and described,
assays directed to identifying potential
pharmaceutical agents that inhibit the biological
activity of the agent are possible. KS drug screening
assays which determine whether or not a drug has
activity against the virus described herein are
contemplated in this invention. Such assays comprise
incubating a compound to be evaluated for use in KS
treatment with cells which express the KS associated
human herpesvirus polypeptides or peptides and
determining therefrom the effect of the compound on
the activity of such agent. In vitro assays in which
the virus is maintained in suitable cell culture are
preferred, though in vivo animal models would also be
effective.

Compounds with activity against the agent of interest
or peptides from such agent can be screened in in
vitro as well as in vivo assay systems. In vitro
assays include infecting peripheral blood leukocytes
or susceptible T cell lines such as MT-4 with the
agent of interest in the presence of varying
concentrations of compounds targeted against viral
replication, including nucleoside analogs, chain
terminators, antisense oligonucleotides and random
polypeptides (Asada et al., 1989, J. Clin. Microbiol.
27, 2204; Kikuta et al., 1989, Lancet Oct. 7, 361).
Infected cultures and their supernatants can be
assayed for the total amount of virus including the
presence of the viral genome by quantitative PCR, by
dot blot assays or by using immunologic methods. For
example, a culture of susceptible cells could be
infected with KSHV in the presence of various
concentrations of drug, fixed on slides after a period
of days, and examined for viral antigen by indirect
immunofluorescence with monoclonal antibodies to viral
polypeptides (Kikuta et al., supra). Alternatively,
chemically adhered MT-4 cell monolayers can be used
for an infectious agent assay using indirect
immunofluorescent antibody staining to search for
focus reduction (Higashi et al., 1989, J. Clin.
Microbiol. 27, 2204).

As an alternative to whole cell in vitro assays,
purified KSHV enzymes isolated from a host cell or
produced by recombinant techniques can be used as
targets for rational drug design to determine the
effect of the potential drug on enzyme activity. KSHV
enzymes amenable to this approach include, but are not
limited to, dihydrofolate reductase (DHFR),
thymidylate synthase (TS), thymidine kinase or DNA
polymerase. A measure of enzyme activity indicates
effect on the agent itself.



Drug screens using herpes viral products are known and
have been previously described in EP 0514830 (herpes
proteases) and WO 94/04920 (UL13 gene product).

This invention provides an assay for screening anti-KS
chemotherapeutics. Infected cells can be incubated in
the presence of a chemical agent that is a potential
chemotherapeutic against KS (e.g., acyclo-guanosine).
The level of virus in the cells is then determined
after several days by immunofluorescence assay for
antigens, Southern blotting for viral genome DNA or
Northern blotting for mRNA and compared to control
cells. This assay can quickly screen large numbers of
chemical compounds that may be useful against KS.



Further, this invention provides an assay system that
is employed to identify drugs or other molecules
capable of binding to the nucleic acid molecule or
proteins, either in the cytoplasm or in the nucleus,
thereby inhibiting or potentiating transcriptional
activity. Such assay would be useful in the
development of drugs that would be specific against
particular cellular activity, or that would potentiate
such activity, in time or in level of activity.

This invention provides a method of screening for a
KSHV-selective antiviral drug in vivo comprising: (a)
expression of KSHV DHFR or KSHV TS in a bacterial
auxotroph (nutritional mutant); (b) measuring
bacterial growth rate in the absence and presence of
the drug; and (c) comparing the rates so measured so
as to identify the drug that inhibits KSHV DHFR or
KSHV TS in vivo.
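
Steps (b) and (c) of the screening method reduce to
comparing two growth rates. A minimal sketch follows,
assuming exponential growth measured by optical density
at two time points; the function names and the
optical-density readout are assumptions made for
illustration, as the text does not specify how growth
is measured.

```python
import math


def specific_growth_rate(od_start: float, od_end: float, hours: float) -> float:
    """Exponential growth rate (per hour) from two optical density readings."""
    if od_start <= 0 or od_end <= 0 or hours <= 0:
        raise ValueError("readings and elapsed time must be positive")
    return math.log(od_end / od_start) / hours


def fractional_inhibition(rate_without_drug: float, rate_with_drug: float) -> float:
    """Fraction by which the drug reduces the growth rate (0 = no effect)."""
    if rate_without_drug <= 0:
        raise ValueError("untreated culture must show positive growth")
    return 1.0 - rate_with_drug / rate_without_drug


if __name__ == "__main__":
    # Hypothetical readings for an auxotroph expressing KSHV DHFR or TS.
    untreated = specific_growth_rate(0.1, 0.4, 2.0)
    treated = specific_growth_rate(0.1, 0.2, 2.0)
    print(fractional_inhibition(untreated, treated))
```

A drug that inhibits the expressed viral enzyme would be
expected to lower the growth rate of the auxotroph that
depends on it, giving a fractional inhibition above
zero.
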



Methods well known to those skilled in the art allow
selection or production of a suitable bacterial
auxotroph and measurement of bacterial growth.

The following reviews of antifolate compounds are
provided to more fully describe the state of the art,
particularly as it pertains to inhibitors of
dihydrofolate reductase and thymidylate synthase: (a)
Unger, 1996, Current concepts of treatment in medical
oncology: new anticancer drugs, Journal of Cancer
Research & Clinical Oncology 122, 189-198; (b)
Jackson, 1995, Toxicity prediction from metabolic
pathway modelling, Toxicology 102, 157-205; (c)
Schultz, 1995, Newer antifolates in cancer therapy,
Progress in Drug Research 44, 129-157; (d) van der
Wilt and Peters, 1994, New targets for pyrimidine
antimetabolites in the treatment of solid tumours 1:
Thymidylate synthase, Pharm. World Sci. 16, 167; (e)
Fleisher, 1993, Antifolate analogs: mechanism of
action, analytical methodology, and clinical efficacy,
Therapeutic Drug Monitoring 15, 521-526; (f) Baggott et
al., 1993, Antifolates in rheumatoid arthritis: a
hypothetical mechanism of action, Clinical &
Experimental Rheumatology 11 Suppl 8, S101-S105; (g)

1992, Nature 355, 362-365) that can substitute for
human cyclin D in phosphorylating the retinoblastoma
tumor suppressor protein.

KSHV encodes a functionally-active IL-6 (ORF K2) and
two macrophage inflammatory proteins (MIPs) (ORFs K4
and K6) which are not found in other human
herpesviruses. The vIL-6 has 62% amino acid
similarity to the human IL-6 and can substitute for
human IL-6 in preventing mouse myeloma cell apoptosis.
Both MIP-like proteins have conserved C-C dimer
signatures characteristic of β-chemokines and near
sequence identity to human MIP-1α in their N-terminus
regions. vMIP-I (ORF K6) can inhibit CCR-5 dependent
HIV-1 replication. An open reading frame spanning
nucleotide numbers (bp) 22,529-22,185 (vMIP-III) has
low conservation with MIP-1β (BLASTX poisson p=0.0015)
but retains the C-C dimer motif. ORF K9 (vIRF1)
encodes a 449 residue protein with similarity to the
family of interferon regulatory factors (IRF) (David,
1995, Pharmac. Ther. 65, 149-161). It has 13.4% amino
acid identity to human interferon consensus sequence
binding protein and partial conservation of the IRF
DNA-binding domain. Three additional open reading
frames at bp 88,910-88,410 (vIRF2), bp 90,541-89,600
(vIRF3) and bp 94,127-93,636 (vIRF4) also have low
similarity to IRF-like proteins (p > 0.35). No
conserved interferon consensus sequences were found in
this region of the genome.

Other genes encoding signal transduction polypeptides,
which are also found in other herpesviruses, include
a complement-binding protein (v-CBP, ORF 4), a neural
cell adhesion molecule (NCAM)-like protein (v-adh, ORF
K14) and an IL-8 receptor (ORF 74). Genes similar to
ORFs 4 and 74 are present in other rhadinoviruses and
ORF 4 is similar to variola B19L and D12L proteins.

Huennekens et al., 1992, Membrane transport of folate
compounds, Journal of Nutritional Science &
Vitaminology Spec No, 52-57; (h) Fleming and Schilsky,
1992, Antifolates: the next generation, Seminars in
Oncology 19, 707-719; and (i) Bertino et al., 1992,
Enzymes of the thymidylate cycle as targets for
chemotherapeutic agents: mechanisms of resistance,
Mount Sinai Journal of Medicine 59, 391-395.

This invention provides a method of determining the
health of a subject with AIDS comprising: (a)
measuring the plasma concentration of vMIP-I, vMIP-II
or vMIP-III; and (b) comparing the measured value to
a standard curve relating AIDS clinical course to the
measured value so as to determine the health of the
subject.



VIII. Treatment of HIV

This invention provides a method of inhibiting HIV
replication, comprising administering to the subject
or treating cells of a subject with an effective
amount of a polypeptide which is encoded by a nucleic
acid molecule, so as to inhibit replication of HIV.

In one embodiment, the polypeptide is one from the
list provided in Table 1.



This invention is further illustrated in the
Experimental Details Sections which follow. These
sections are set forth to aid in understanding the
invention but are not intended to, and should not be
construed to, limit in any way the invention as set
forth in the claims which follow thereafter.



EXPERIMENTAL DETAILS SECTION I

NUCLEOTIDE SEQUENCE OF THE KAPOSI'S SARCOMA-ASSOCIATED
HERPESVIRUS

The genome of the Kaposi's sarcoma-associated
herpesvirus (KSHV or HHV8) was mapped with cosmid and
phage genomic libraries from the BC-1 cell line. Its
nucleotide sequence was determined except for a 3 kb
region at the right end of the genome that was
refractory to cloning. The BC-1 KSHV genome consists
of a 140.5 kb long unique coding region (LUR) flanked
by multiple G+C rich 801 bp terminal repeat sequences.
A genomic duplication that apparently arose in the
parental tumor is present in this cell culture-derived
strain. At least 81 open reading frames (ORFs),
including 66 with similarity to herpesvirus saimiri
ORFs, and 5 internal repeat regions are present in the
LUR. The virus encodes genes similar to
complement-binding proteins, three cytokines (two
macrophage inflammatory proteins and interleukin-6),
dihydrofolate reductase, bcl-2, interferon regulatory
factor, IL-8 receptor, NCAM-like adhesin, and a D-type
cyclin, as well as viral structural and metabolic
proteins. Terminal repeat analysis of virus DNA from
a KS lesion suggests a monoclonal expansion of KSHV in
the KS tumor. The complete genome sequence is set
forth in Genbank Accession Numbers U75698 (LUR),
U75699 (TR) and U75700 (ITR).



Kaposi's sarcoma is a vascular tumor of mixed cellular
composition (Tappero et al., 1993, J. Am. Acad.
Dermatol. 28, 371-395). The histology and relatively
benign course in persons without severe
immunosuppression has led to suggestions that KS tumor
cell proliferation is cytokine induced (Ensoli et al.,
1992, Immunol. Rev. 127, 147-155). Epidemiologic
studies indicate the tumor is under strict immunologic
control and is likely to be caused by a sexually
transmitted infectious agent other than HIV (Peterman
et al., 1993, AIDS 7, 605-611). KS-associated
herpesvirus (KSHV) was discovered in an AIDS-KS lesion
by representational difference analysis (RDA) and
shown to be present in almost all AIDS-KS lesions
(Chang et al., 1994, Science 265, 1865-1869). These
findings have been confirmed and extended to nearly
all KS lesions examined from the various epidemiologic
classes of KS (Boshoff et al., 1995, Lancet 345,
1043-1044; Dupin et al., 1995, Lancet 345, 761-762;
Moore and Chang, 1995, New Eng. J. Med. 332,
1181-1185; Schalling et al., 1995, Nature Med. 1,
707-708; Chang et al., 1996, Arch. Int. Med. 156,
202-204). KSHV is the eighth presumed human
herpesvirus (HHV8) identified to date.



The virus was initially identified from two herpesvirus
DNA fragments, KS330Bam and KS631Bam (Chang et al.,
1994, Science 265, 1865-1869). Subsequent sequencing
of a 21 kb AIDS-KS genomic library fragment (KS5)
hybridizing to KS330Bam demonstrated that KSHV is a
gammaherpesvirus related to herpesvirus saimiri (HVS)
belonging to the genus Rhadinovirus (Moore et al.,
1996, J. Virol. 70, 549-558). Colinear similarity
(synteny) of genes in this region is maintained
between KSHV and HVS, as well as Epstein-Barr virus
(EBV) and equine herpesvirus 2 (EHV2). A 12 kb region
(L54 and SGL-1) containing the KS631Bam sequence
includes cyclin D and IL-8Rα genes unique to
rhadinoviruses.

KSHV is not readily transmitted to uninfected cell
lines (Moore et al., 1996, J. Virol. 70, 549-558),
but it is present in a rare B cell primary effusion
(body cavity-based) lymphoma (PEL) frequently
associated with KS (Cesarman et al., 1995, New Eng. J.
Med. 332, 1186-1191). BC-1 is a PEL cell line
containing a high KSHV genome copy number and is
coinfected with EBV (Cesarman et al., 1995, Blood 86,
2708-2714). The KSHV genome form in BC-1 and its
parental tumor comigrates with 270 kb linear markers
on pulsed field gel electrophoresis (PFGE) (Moore et
al., 1996, J. Virol. 70, 549-558). However, the
genome size based on encapsidated DNA from an
EBV-negative cell line (Renne et al., 1996, Nature
Med. 2, 342-346) is estimated to be 165 kb (Moore et
al., 1996, J. Virol. 70, 549-558). Estimates from KS
lesions indicate a genome size larger than that of EBV
(172 kb) (Decker et al., 1996, J. Exp. Med. 184,
283-288).



To determine the genomic sequence of KSHV and identify
novel virus genes, contiguous overlapping virus DNA
inserts from BC-1 genomic libraries were mapped. With
the exception of a small, unclonable repeat region at
its right end, the genome was sequenced to high
redundancy allowing definition of the viral genome
structure and identification of genes that may play a
role in KSHV-related pathogenesis.



MATERIALS AND METHODS

Library generation and screening. BC-1, HBL-6 and
BCP-1 cells were maintained in RPMI 1640 with 20%
fetal calf serum (Moore et al., 1996, J. Virol. 70,
549-558; Cesarman et al., 1995, Blood 86, 2708-2714;
Gao et al., 1996, Nature Med. 2, 925-928). DNA from
BC-1 cells was commercially cloned (Sambrook et al.,
1989, Molecular Cloning: A laboratory manual, Cold
Spring Harbor Press, Salem, Mass.) into either Lambda
FIX II or S-Cos1 vectors (Stratagene, La Jolla, CA).
Phage and cosmid libraries were screened by standard
methods (Benton et al., 1977, Science 196, 180-182;
Hanahan and Meselson, 1983, Methods Enzymol. 100,
333-342).

Initial library screening was performed using the
KS330Bam and KS631Bam RDA fragments (Chang et al.,
1994, Science 265, 1865-1869). Overlapping clones
were sequentially identified using probes synthesized
from the ends of previously identified clones (Figure
1) (Feinberg and Vogelstein, 1983, Anal. Biochem. 132,
6; Melton et al., 1984, Nucl. Acids Res. 12,
7035-7056). The map was considered circularly
permuted by the presence of multiple, identical TR
units in cosmids Z2 and Z6. Each candidate phage or
cosmid was confirmed by tertiary screening.

Shotgun sequencing and sequence verification

Lambda and cosmid DNA was purified by standard methods
(Sambrook et al., 1989, Molecular Cloning: A
laboratory manual, Cold Spring Harbor Press, Salem,
Mass.). Shotgun sequencing (Deininger, 1983, Anal.
Biochem. 129, 216-223; Bankier et al., 1987, Meth.
Enzymol. 155, 51-93) was performed on sonicated DNA.
A 1-4 kb fraction was subcloned into M13mp19 (New
England Biolabs, Inc., Beverly, MA) and propagated in
XL1-Blue cells (Stratagene, La Jolla, CA) (Sambrook et
al., 1989, Molecular Cloning: A laboratory manual,
Cold Spring Harbor Press, Salem, Mass.). M13 phages
were positively screened using insert DNA from the
phage or cosmid, and negatively screened with vector
arm DNA or adjacent genome inserts.

Automated dideoxy cycle sequencing was performed with
M13 (-21) CS+ or FS dye primer kits (Perkin-Elmer,
Branchburg NJ) on ABI 373A or 377 sequenators (ABI,
Foster City, CA). Approximately 300 M13 sequences
were typically required to achieve initial coverage
for each 10 kb of insert sequence. Minimum sequence
fidelity standards were defined as complete,
bidirectional coverage with at least 4 overlapping
sequences at any given site. For regions with
sequence gaps, ambiguities or frameshifts that did not
meet these criteria, primer walking was done with
custom primers (Perkin-Elmer) and dye terminator
chemistry (FS or Ready Reaction kits, Perkin-Elmer).
An unsequenced 3 kb region adjacent to the right end
TR sequence in the Z2 cosmid insert could not be
cloned into M13 or Bluescript despite repeated
efforts.
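
The relationship between the number of shotgun reads
and the sequence redundancy they provide can be
sketched as below. The read count and insert length
come from the text; the mean read length of 400 bp is
an assumption introduced here for illustration, as no
read length is stated. The uncovered-fraction estimate
uses the standard Poisson approximation for randomly
placed reads, which is not a method attributed to this
work.

```python
import math


def mean_coverage(n_reads: int, mean_read_length_bp: float, target_bp: float) -> float:
    """Average redundancy: total bases sequenced divided by target length."""
    if target_bp <= 0:
        raise ValueError("target length must be positive")
    return n_reads * mean_read_length_bp / target_bp


def expected_uncovered_fraction(coverage: float) -> float:
    """Poisson estimate of the fraction of bases hit by no read at all."""
    return math.exp(-coverage)


if __name__ == "__main__":
    # 300 reads per 10 kb (from the text); 400 bp per read is an assumption.
    c = mean_coverage(300, 400, 10_000)
    print(c, expected_uncovered_fraction(c))
```

Under the assumed read length, 300 reads per 10 kb
corresponds to about 12-fold redundancy. Random
coverage alone does not guarantee bidirectional
coverage at every site, which is why primer walking is
used to close the remaining gaps.
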

Sequence assembly and open reading frame analysis

Sequence data were edited using Factura (ABI, Foster
City, CA) and assembled into contiguous sequences
using electropherograms with AutoAssembler (ABI,
Foster City, CA) and into larger assemblies with
AssemblyLIGN (IBI-Kodak, Rochester NY). Base
positions not clearly resolved by multiple sequencing
attempts (less than 10 bases in total) were assigned
the majority base pair designation. The entire
sequence (in 1-3 kb fragments) and all predicted open
reading frames (ORFs) were analyzed using BLASTX,
BLASTP and BLASTN (Altschul et al., 1990, J. Mol.
Biol. 215, 403-410). The sequence was further
analyzed using MOTIFS (Moore et al., 1996, J. Virol.
70, 549-558), REPEAT and BESTFIT (GCG), and MacVector
(IBI, New Haven, CT).
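
The majority base designation applied to unresolved
positions can be illustrated with a short sketch. The
function name and the handling of ties (returning "N")
are assumptions made here; the text states only that
the majority base was assigned.

```python
from collections import Counter


def majority_base(calls: str) -> str:
    """Most frequent A/C/G/T among the base calls at one position.

    Ambiguous calls are ignored. A tie, or no usable call, yields "N".
    """
    counts = Counter(base for base in calls.upper() if base in "ACGT")
    if not counts:
        return "N"
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "N"  # no single majority base
    return ranked[0][0]


if __name__ == "__main__":
    # Hypothetical calls from four overlapping reads at one position.
    print(majority_base("AAGA"))
```
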

ORF assignment and nomenclature

All ORFs with similarities to HVS were identified.
These and other potential ORFs having >100 amino acids
were found using MacVector. ORFs not similar to HVS
ORFs were included in the map (Fig. 1) based on
similarity to other known genes, optimum initiation
codon context (Kozak, 1987, Nucl. Acids Res. 15,
8125-8148), size and position. Conservative
selections were made to minimize spurious assignments;
this underestimates the number of true reading frames.
KSHV ORF nomenclature is based on HVS similarities;
KSHV ORFs not similar to HVS genes are numbered in
consecutive order with a K prefix. ORFs with sequence
but not positional similarity to HVS ORFs were
assigned the HVS ORF number (e.g., ORF 2). As new
ORFs are identified, it is suggested that they be
designated by decimal notation. The standard map
orientation (Fig. 1) of the KSHV genome is the same as
for HVS (Albrecht et al., 1992, J. Virol. 66,
5047-5058) and EHV2 (Telford et al., 1995, J. Mol.
Biol. 249, 520-528), and reversed relative to the EBV
standard map (Baer et al., 1984, Nature 310, 207-211).
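
The size criterion for candidate ORFs (more than 100
amino acids) can be illustrated with a simple six-frame
scan. This is a sketch under stated assumptions and is
not the MacVector procedure used in the work: it treats
the first ATG after a stop codon as the start, uses the
standard stop codons, and ignores initiation codon
context and similarity to known genes. All names are
hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, longer_than_aa: int = 100) -> list:
    """Scan all six frames for ATG...stop ORFs longer than the cutoff.

    Returns (strand, start, end, aa_length) tuples. Coordinates are
    1-based, inclusive of the stop codon, on the forward strand.
    """
    seq = seq.upper()
    n = len(seq)
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None:
                    if codon == "ATG":
                        start = i
                elif codon in STOP_CODONS:
                    aa_len = (i - start) // 3  # residues, including Met
                    if aa_len > longer_than_aa:
                        if strand == "+":
                            found.append((strand, start + 1, i + 3, aa_len))
                        else:
                            # Map reverse-strand indices back to forward coordinates.
                            found.append((strand, n - (i + 3) + 1, n - start, aa_len))
                    start = None
    return found


if __name__ == "__main__":
    # Synthetic 121-residue ORF used only to exercise the scan.
    demo = "ATG" + "GCT" * 120 + "TAA"
    print(find_orfs(demo))
```

A length cutoff alone reports many spurious frames in a
G+C rich genome, which is consistent with the
additional criteria described in the text.
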



RESULTS 

Genomic mapping and sequence characteristics

Complete genome mapping was achieved with 7 lambda and
3 cosmid clones (Fig. 1). The structure of the BC-1
KSHV genome is similar to HVS in having a long unique
region (LUR) flanked by TR units. The ~140.5 kb LUR
sequence has 53.5% G+C content and includes all
identified KSHV ORFs. TR regions consist of multiple
801 bp direct repeat units having 84.5% G+C content
(Fig. 2A) with potential packaging and cleavage sites.
Minor sequence variations are present among repeat
units. The first TR unit at the left (Z6) TR junction
(205 bp) is deleted and truncated in BC-1 compared to
the prototypical TR unit.
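
G+C content figures such as those quoted for the LUR
and the TR unit are simple base-composition fractions.
A minimal sketch follows; the function name is
illustrative, and excluding ambiguous bases from the
denominator is a choice made here rather than one taken
from the text.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / total


if __name__ == "__main__":
    # Short synthetic sequence used only to exercise the function.
    print(gc_fraction("GGCCAT"))
```
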

The genome sequence abutting the right terminal repeat
region is incomplete due to a 3 kb region in the Z2
cosmid insert that could not be cloned into sequencing
vectors. Partial sequence information from primer
walking indicates that this region contains stretches
of 16 bp A+G rich imperfect direct repeats
interspersed with at least one stretch of 16 bp C+T
rich imperfect direct repeats. These may form a
larger inverted repeat that could have contributed to
our difficulty in subcloning this region. Greater
than 12-fold average sequence redundancy was achieved
for the entire LUR with complete bidirectional
coverage by at least 4 overlapping reads except in the
unclonable region.

The BC-1 TR region was examined by Southern blotting
since sequencing of the entire region is not possible
due to its repeat structure. BC-1, BCP-1 (an
EBV-negative, KSHV infected cell line) and KS lesion
DNAs have an intense ~800 bp signal consistent with
unit length repeat sequence when digested with
enzymes that cut once in the TR and hybridized to a TR
probe (Figs. 2B and 2C). Digestion with enzymes that
do not cut in the TR indicates that the BC-1 strain
contains a unique region buried in the TR, flanked by
~7 kb and ~35 kb TR sequences (Figs. 2C and 2D). An
identical pattern occurs in HBL-6, a cell line
independently derived from the same tumor as BC-1,
suggesting that this duplication was present in the
parental tumor (Figs. 2C and 2D). The restriction
pattern with Not I, which also cuts only once within
the TR but rarely within the LUR, suggests that the
buried region is at least 33 kb. Partial sequencing
of this region demonstrates that it is a precise
genomic duplication of the region beginning at ORF K8.
The LUR is 140 kb including the right end unsequenced
gap (<3 kb). The estimated KSHV genomic size in BC-1
and HBL-6 (including the duplicated region) is
approximately 210 kb.

Based on the EBV replication model used in clonality
studies (Raab-Traub and Flynn, 1986, Cell 47,
883-889), the polymorphic BCP-1 laddering pattern may
reflect lytic virus replication and superinfection
(Fig. 2C). The EBV laddering pattern occurs when TR
units are deleted or duplicated during lytic
replication and is a stochastic process for each
infected cell (Raab-Traub and Flynn, 1986, Cell 47,
883-889). No laddering is present for BC-1 which is
under tight latent KSHV replication control (Moore et
al., 1996, J. Virol. 70, 549-558). KS lesion DNA
also shows a single hybridizing band suggesting that
virus in KS tumor cells may be of monoclonal origin.



Features and coding regions of the KSHV LUR



The KSHV genome shares the 7 block (B) organization
(B1-B7, Fig. 1) of other herpesviruses (Chee et al.,
1990, Curr. Topics Microbiol. Immunol. 154, 125-169),
with sub-family specific or unique ORFs present
between blocks (interblock regions (IB) a-h, Fig. 1).
ORF analysis indicates that only 79% of the sequenced
137.5 kb LUR encodes 81 identifiable ORFs which is
likely to be due to a conservative assignment of ORF
positions. The overall LUR CpG dinucleotide
observed/expected (O/E) ratio is 0.75 consistent with
a moderate loss of methylated cytosines, but there is
marked regional variation. The lowest CpG O/E ratios
(<0.67) occur in IBa (bp 1-3200), in B5
(68,602-69,405) and IBh (117,352-137,507). The
highest O/E ratios (>0.88) extend from B2 to B3
(30,701-47,849), in IBe (67,301-68,600), and in B6
(77,251-83,600). Comparison to the KS5 sequence
(Moore et al., 1996, J. Virol. 70, 549-558) shows a
high sequence conservation between these two strains
with only 21 point mutations over the comparable 20.7
kb region (0.1%). A frameshift within BC-1 ORF 28
(position 49,004) compared to KS5 ORF 28 was
irresolvable despite repeated sequencing of KS5 and PCR
products amplified from BC-1. Two additional
frameshifts in noncoding regions (bp 47,862 and
49,338) are also present compared to the KS5 sequence.
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Several repeat regions are present in the Lu7 (Fig. 
1} A 14 3 bp sequence is repeated within ORF Kll at 
positions 92,678-92,820 and 92,852-92,994 (waka/jwka). 
Comdex repeats are present in other regions of the 
genome: 20 and 30 bp repeats in the region from 
24,285-24,902 (fmk), a 13 bp repeat between bases 
29,775 and 29,942 (vnct) , two separate 23 op repeat 
stretches between bases 118,123 and 118,69^ iz??a), 
i 5 and 15 different 11-16 bp repeats throughout the 

region from 124,527 to 126,276 (moil. A complex A-G 
rich repeat region (mdsk) begins at 137,099 and 
extends into the unsequenced gap . 
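
The CpG observed/expected (O/E) ratios quoted for the LUR
can be recomputed from raw sequence with a short
calculation. The sketch below assumes the commonly used
definition O/E = (number of CpG dinucleotides x sequence
length) / (number of C x number of G); the text does not
state which formula or window size was applied, so the
function names, the window and step parameters, and the toy
sequence are illustrative assumptions only. The last helper
simply checks the arithmetic behind the quoted 0.1%
divergence (21 point mutations over 20.7 kb).

```python
def cpg_obs_exp(seq: str) -> float:
    """CpG observed/expected ratio: (#CG * N) / (#C * #G)."""
    s = seq.upper()
    n_c = s.count("C")
    n_g = s.count("G")
    if n_c == 0 or n_g == 0:
        # Ratio is undefined without both bases; 0.0 is returned by convention here.
        return 0.0
    return s.count("CG") * len(s) / (n_c * n_g)


def windowed_cpg(seq: str, window: int, step: int):
    """Yield (start, end, O/E) with 1-based inclusive coordinates.

    Window and step sizes are hypothetical; the source does not give them.
    """
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        chunk = seq[start:start + window]
        yield start + 1, start + len(chunk), cpg_obs_exp(chunk)


def percent_divergence(mutations: int, region_bp: int) -> float:
    """Point mutations expressed as a percentage of the compared region."""
    return 100.0 * mutations / region_bp


if __name__ == "__main__":
    toy = "ACGTCGGGCCATATATCGCG"  # toy sequence, not KSHV data
    print(cpg_obs_exp(toy))
    print(list(windowed_cpg(toy, window=10, step=5)))
    print(round(percent_divergence(21, 20_700), 2))
```

Regions with a ratio well below 1 are the ones the text
interprets as having lost methylated cytosines.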



Conserved ORFs with similar genes found in other
herpesviruses are listed in Table 1, along with their
polarity, map positions, sizes, relatedness to HVS and
EBV ORFs, and putative functions. Conserved ORFs
coding for viral structural proteins and enzymes
include genes involved in viral DNA replication (e.g.,
DNA polymerase (ORF 9)), nucleotide synthesis (e.g.,
dihydrofolate reductase (DHFR, ORF 2), thymidylate
synthase (TS, ORF 70)), regulators of gene expression
(R transactivator (LCTP, ORF 50)) and 5 conserved
herpesvirus structural capsid and 5 glycoprotein
genes.

Several genes that are similar to HVS ORFs also have
unique features. ORF 45 has sequence similarity to
nuclear and transcription factors (chick nucleolin and
yeast SIR3) and has an extended acidic domain typical
for transactivator proteins between amino acids 90 and
115. ORF 73 also has an extended acidic domain
separated into two regions by a glutamine-rich
sequence encoded by the moi repeat. The first region
consists almost exclusively of aspartic and glutamic
acid residue repeats while the second glutamic acid
rich region has a repeated leucine heptad motif
suggestive of a leucine zipper structure. ORF 75, a
putative tegument protein, has a high level of
similarity to the purine biosynthetic enzyme of E.
coli and D. melanogaster N-formylglycinamide ribotide
amidotransferase (FGARAT).



ORFs K3 and K5 are not similar to HVS genes but are
similar to the major immediate early bovine
herpesvirus type 4 (BHV4) gene IE1 (12 and 13%
identity respectively) (van Santen, 1991, J. Virol.
65, 5211-5224). These genes have no significant
similarity to the herpes simplex virus 1 (HSV1) α0
(which is similar to BHV4 IE1), but encode proteins
sharing with the HSV1 ICP0 protein a cysteine-rich
region which may form a zinc finger motif (van Santen,
1991, J. Virol. 65, 5211-5224). The protein encoded
by ORF K5 has a region similar to the nuclear
localization site present in the late form of the BHV4
protein. ORF K8 has a purine binding motif (GLLVTGKS)
in the C-terminus of the protein which is similar to
a motif present in the KSHV TK (ORF 21) (Moore et al.,
1996, J. Virol. 70, 549-558).

No KSHV genes with similarity to HVS ORFs 1, 3, 5, 12,
13, 14, 15, 51 and 71 were identified in the KSHV LUR
sequence. HVS ORF 1 codes for a transforming protein,
responsible for HVS-induced in vitro lymphocyte
transformation (Akari et al., 1996, Virology 218,
382-388) and has poor sequence conservation among HVS
strains (Jung and Desrosiers, 1991, J. Virol. 65,
6953-6960; Jung and Desrosiers, 1995, Molec. Cellular
Biol. 15, 6506-6512). Functional KSHV genes similar
to this gene may be present but were not identifiable
by sequence comparison. Likewise, no KSHV genes
similar to EBV latency and transformation-associated
proteins (EBNA-1, EBNA-2, EBNA-LP, LMP-1, LMP-2 or
gp350/220) were found despite some similarity to
repeat sequences present in these genes. KSHV also
does not have a gene similar to the BZLF1 EBV
transactivator gene.

Several sequences were not given ORF assignments
although they have characteristics of expressed genes.
The sequence between bp 90,173 and 90,643 is similar
to the precursor of secreted glycoprotein X (gX),
encoded by a number of alphaherpesviruses
(pseudorabies, EHV1), and which does not form part of
the virion structure. Like the cognate gene in EHV1,
the KSHV form lacks the highly-acidic carboxy terminus
of the pseudorabies gene.

Two polyadenylated transcripts expressed at high copy
number in BCBL-1 are present at positions
28,651-29,741 (T1.1) in IBb and 118,130-117,436 (T0.7)
in IBh. T0.7 encodes a 60 residue polypeptide (ORF
K12, also called Kaposin) and T1.1 (also referred to
as nut-1) has been speculated to be a U RNA-like
transcript.
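
Map positions in this section are given as coordinate
pairs, and several are listed with the larger coordinate
first. A minimal sketch of how such pairs can be turned into
a length and an orientation is shown below. It assumes
1-based inclusive coordinates and assumes that a pair written
high-to-low denotes a feature read leftward on the genome
map; neither convention is stated explicitly in the text,
and the function name is hypothetical. The example inputs
are coordinates quoted in this section (the first copy of
the repeat within ORF K11, and the vMIP-III reading frame).

```python
def feature_span(start: int, end: int):
    """Return (length in bp, strand) for a pair of 1-based inclusive positions.

    Assumption: a pair listed high-to-low is a leftward ("-") feature.
    """
    length = abs(end - start) + 1
    strand = "+" if end >= start else "-"
    return length, strand


if __name__ == "__main__":
    print(feature_span(92_678, 92_820))  # repeat copy within ORF K11
    print(feature_span(22_529, 22_185))  # reading frame listed high-to-low
```

For a reading frame, dividing the length by three gives the
number of codons spanned, including any stop codon that
falls inside the quoted coordinates.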

Cell cycle regulation and cell signaling proteins

A number of ORFs which are either unique to KSHV or
shared only with other gammaherpesviruses encode genes
similar to oncoproteins and cell signaling proteins.
ORF 16, similar to EBV BHRF1 and HVS ORF 16, encodes a
functional Bcl-2-like protein which can inhibit
Bax-mediated apoptosis. ORF 72 encodes a functional
cyclin D gene, also found in HVS (Nicholas et al.,
1992, Nature 355, 362-365), that can substitute for
human cyclin D in phosphorylating the retinoblastoma
tumor suppressor protein.

KSHV encodes a functionally-active IL-6 (ORF K2) and
two macrophage inflammatory proteins (MIPs) (ORFs K4
and K6) which are not found in other human
herpesviruses. The vIL-6 has 62% amino acid
similarity to the human IL-6 and can substitute for
human IL-6 in preventing mouse myeloma cell apoptosis.
Both MIP-like proteins have conserved C-C dimer
signatures characteristic of β-chemokines and near
sequence identity to human MIP-1α in their N-terminus
regions. vMIP-I (ORF K6) can inhibit CCR5-dependent
HIV-1 replication. An open reading frame spanning
nucleotide numbers (bp) 22,529-22,185 (vMIP-III) has
low conservation with MIP-1β (BLASTX poisson p=0.0015)
but retains the C-C dimer motif. ORF K9 (vIRF1)
encodes a 449 residue protein with similarity to the
family of interferon regulatory factors (IRF) (David,
1995, Pharmac. Ther. 65, 149-161). It has 13.4% amino
acid identity to human interferon consensus sequence
binding protein and partial conservation of the IRF
DNA-binding domain. Three additional open reading
frames at bp 88,910-88,410 (vIRF2), bp 90,541-89,600
(vIRF3) and bp 94,127-93,636 (vIRF4) also have low
similarity to IRF-like proteins (p > 0.35). No
conserved interferon consensus sequences were found in
this region of the genome.

Other genes encoding signal transduction polypeptides,
which are also found in other herpesviruses, include
a complement-binding protein (v-CBP, ORF 4), a neural
cell adhesion molecule (NCAM)-like protein (v-adh, ORF
K14) and an IL-8 receptor (ORF 74). Genes similar to
ORFs 4 and 74 are present in other rhadinoviruses and
ORF 4 is similar to variola B19L and D12L proteins.
ORF K14 (v-adh) is similar to the rat and human OX-2
membrane antigens, various NCAMs and the poliovirus
receptor-related protein PRR1. OX-2 is in turn
similar to ORF U85 of human herpesviruses 6 and 7 but
there is no significant similarity between the KSHV
and betaherpesvirus OX-2/NCAM ORFs. Like other
immunoglobulin family adhesion proteins, v-adh has
V-like, C-like, transmembrane and cytoplasmic domains,
and an RGD binding site for fibronectin at residues
268-270. The vIL-8R has a seven transmembrane
spanning domain structure characteristic of G-protein
coupled chemoattractant receptors which includes the
EBV-induced EBI1 protein (Birkenbach et al., 1993, J.
Virol. 67, 2209-2220).

DISCUSSION

The full-length sequence of the KSHV genome in BC-1
cells provides the opportunity to investigate
molecular mechanisms of KSHV-associated pathogenesis.
The KSHV genome has standard features of rhadinovirus
genomes including a single unique coding region
flanked by high G+C terminal repeat regions which are
the presumed sites for genome circularization. In
addition to having conserved genes
involved in herpesvirus replication and structure,
KSHV is unique in encoding a number of proteins
mimicking cell cycle regulatory and signaling
proteins.

Our estimated size of the BC-1 derived genome (210 kb
including the duplicated portion) is consistent with
that found using encapsidated virion DNA (Zhong et
al., 1996, Proc. Natl. Acad. Sci. USA 93, 6641-6646).
Genomic rearrangements are common in cultured
herpesviruses (Baer et al., 1984, Nature 310, 207-211;
Cha et al., 1996, J. Virol. 70, 78-83). However, the
genomic duplication present in the BC-1 KSHV probably
did not arise during tissue culture passage. TR
hybridization studies indicate that this insertion of
a duplicated LUR fragment into the BC-1 TR is also
present in KSHV from the independently derived HBL-6
cell line (Gaidano et al., 1996, Leukemia 10,
1237-40).

Despite this genomic rearrangement, the KSHV genome is
well conserved within coding regions. There is less
than 0.1% base pair variation between the BC-1 and the
21 kb KS5 fragment isolated from a KS lesion. Higher
levels of variation may be present in strains from
other geographic regions or other disease conditions.
Within the LUR, synteny to HVS is lost at ORFs 2 and
70 but there is concordance in all other regions
conserved with HVS. Several conserved genes, such as
thymidine kinase (TK) (Cesarman et al., 1995, Blood
86, 2708-2714), TS and DHFR (which is present in HVS,
see Albrecht et al., 1992, J. Virol. 66, 5047-5058,
but not human herpesviruses), encode proteins that are
appropriate targets for existing drugs.

Molecular mimicry by KSHV of cell cycle regulatory and
signaling proteins is a prominent feature of the
virus. The KSHV genome has genes similar to cellular
complement-binding proteins (ORF 4), cytokines (ORFs
K2, K4 and K6), a bcl-2 protein (ORF 16), a cytokine
transduction pathway protein (K9), an IL-8R-like
protein (ORF 74) and a D-type cyclin (ORF 72).
Additional regions coding for proteins with some
similarity to MIP and IRF-like proteins are also
present in the KSHV genome. There is a striking
parallel between the KSHV genes that are similar to
cellular genes and the cellular genes known to be
induced by EBV infection. Cellular cyclin D,
CD21/CR2, bcl-2, an IL-8R-like protein (EBI1), IL-6
and adhesion molecules are upregulated by EBV
infection (Birkenbach et al., 1993, J. Virol. 67,
2209-2220; Palmero et al., 1993, Oncogene 8,
1049-1054; Finke et al., 1992, Blood 80, 459-469;
Finke et al., 1994, Leukemia & Lymphoma 12, 413-419;
Jones et al., 1995, J. Exper. Med. 182, 1213-1221).
This suggests that KSHV modifies the same signaling
and regulation pathways that EBV modifies after
infection, but does so by introducing exogenous genes
from its own genome.

Cellular defense against virus infection commonly
involves cell cycle shutdown, apoptosis (for review,
see Shen and Shenk, 1995, Curr. Opin. Genet. Devel. 5,
105-111) and elaboration of cell-mediated immunity
(CMI). The KSHV-encoded v-bcl-2, v-cyclin and v-IL-6
are active in preventing either apoptosis or cell
cycle shutdown (Chang et al., 1996, Nature 382, 410).
At least one of the β-chemokine KSHV gene products,
v-MIP-I, prevents CCR5-mediated HIV infection of
transfected cells. β-chemokines are not known to be
required for successful EBV infection of cells
although EBV-infected B cells express higher levels of
MIP-1α than normal tonsillar lymphocytes (Harris et
al., 1993, 151, 5975-5983). The autocrine dependence
of EBV-infected B cells on small and uncharacterized
protein factors in addition to IL-6 (Tosato et al.,
1990, J. Virol. 64, 3033-3041) leads to speculation
that β-chemokines may also play a role in the EBV life
cycle.

KSHV has not formally been shown to be a transforming
virus and genes similar to the major transforming
genes of HVS and EBV are not present in the BC-1
strain KSHV. Nonetheless, dysregulation of cell
proliferation control caused by the identified
KSHV-encoded proto-oncogenes and cytokines may
contribute to neoplastic expansion of virus-infected
cells. Preliminary studies suggest that subgenomic
KSHV fragments can transform NIH 3T3 cells. If KSHV
replication, like that of EBV, involves recombination
of TR units (Raab-Traub and Flynn, 1986, Cell 47,
883-889), a monomorphic TR hybridization pattern
present in a KS lesion would indicate a clonal virus
population in the tumor. This is consistent with KS
being a true neoplastic proliferation arising from a
single transformed, KSHV-infected cell rather than KSHV
being a "passenger virus". Identification of KSHV
genes similar to known oncoproteins and cell
proliferation factors in the current study provides
evidence that KSHV is likely to be a transforming
virus.

EXPERIMENTAL DETAILS SECTION II:

MOLECULAR MIMICRY OF HUMAN CYTOKINE AND CYTOKINE
RESPONSE PATHWAY GENES BY KSHV

Four virus genes encoding proteins similar to two
human macrophage inflammatory protein (MIP)
chemokines, an IL-6 and an interferon regulatory
factor (IRF or ICSBP) polypeptide are present in the
genome of Kaposi's sarcoma-associated herpesvirus
(KSHV). Expression of these genes is inducible in
infected cell lines by phorbol esters. vIL-6 is
functionally active in B9 cell proliferation assays.
It is primarily expressed in KSHV-infected
hematopoietic cells rather than KS lesions. vMIP-I
inhibits replication of CCR5-dependent HIV-1 strains
in vitro indicating that it is functional and could
contribute to interactions between these two viruses.
Mimicry of cell signaling proteins by KSHV may
abrogate host cell defenses and contribute to
KSHV-associated neoplasia.



Kaposi's sarcoma-associated herpesvirus (KSHV) is a
gammaherpesvirus related to Epstein-Barr virus (EBV)
and herpesvirus saimiri (HVS). It is present in
nearly all KS lesions including the various types of
HIV-related and HIV-unrelated KS (Chang et al., 1994,
Science 266, 1865-1869; Boshoff et al., 1995, Lancet
345, 1043-1044; Dupin et al., 1995, Lancet 345,
761-762; Schalling et al., 1995, Nature Med. 1,
707-708). Viral DNA preferentially localizes to KS
tumors (Boshoff et al., 1995, Nature Med. 1,
1274-1278) and serologic studies show that KSHV is
specifically associated with KS. Related
lymphoproliferative disorders frequently occurring in
patients with KS, such as primary effusion lymphomas
(PEL), a rare B cell lymphoma, and some forms of
Castleman's disease are also associated with KSHV
infection (Cesarman et al., 1995, New Eng. J. Med.
332, 1186-1191; Soulier et al., 1995, Blood 86,
1276-1280). Three KSHV-encoded cytokine-like
polypeptides and a polypeptide similar to interferon
regulatory factor genes have now been identified.
Paradoxically, while cytokine dysregulation has been
proposed to cause Kaposi's sarcoma (Ensoli et al.,
1994, Nature 371, 674-680; Miles, 1992, Cancer
Treatment & Research 63, 129-140), in vitro spindle
cell lines used for these studies over the past decade
are uniformly uninfected with KSHV (Ambroziak et al.,
Science 268, 582-583; Lebbe et al., 1995, Lancet 345,
1180).

To identify unique genes in the KSHV genome, genomic
sequencing (see METHODS) was performed using
Supercos-1 and Lambda FIX II genomic libraries from
BC-1, a nonHodgkin's lymphoma cell line stably
infected with both KSHV and EBV (Cesarman et al.,
1995, Blood 86, 2708-2714). The KSHV DNA fragments
KS330Bam and KS631Bam (Chang et al., 1994, Science
266, 1865-1869) were used as hybridization starting
points for mapping and bi-directional sequencing.
Open reading frame (ORF) analysis (see METHODS) of the
Z6 cosmid sequence identified two separate coding
regions (ORFs K4 and K6) with sequence similarity to
β-chemokines and a third coding region (ORF K2)
similar to human interleukin-6 (huIL-6); a fourth
coding region (ORF K9) is present in the Z8 cosmid
insert sequence with sequence similarity to interferon
regulatory factor (IRF) polypeptides (Figures 3A-3C).
None of these KSHV genes are similar to other known
viral genes. Parenthetically, a protein with
conserved cysteine motifs similar to β-chemokine motif
signatures has recently been reported in the molluscum
contagiosum virus (MCV) genome. Neither vMIP-I nor
vMIP-II has significant similarity to the MCV protein.

The cellular counterparts to these four viral genes
encode polypeptides involved in cell responses to
infection. For example, the MIP/RANTES (macrophage
inflammatory protein/regulated on activation, normal
T cell expressed and secreted) family of 8-10 kDa
β-chemoattractant cytokines (chemokines) play an
important role in virus infection-mediated
inflammation (Cook et al., 1995, Science 269,
1583-1585). β-chemokines are the natural ligand for
CCR5 and can block entry of non-syncytium inducing
(NSI), primary lymphocyte and macrophage-tropic HIV-1
strains in vitro by binding to this HIV co-receptor
(Cocchi et al., 1995, Science 270, 1811-1815). IL-6,
initially described by its effect on B cell
differentiation (Hirano et al., 1985, Proc Natl Acad
Sci USA 82, 5490; Kishimoto et al., 1995, Blood 86,
1243-1254), has pleiotropic effects on a wide variety
of cells and may play a pathogenic role in multiple
myeloma, multicentric Castleman's disease (a
KSHV-related disorder), AIDS-KS and EBV-related
postransplant lymphoproliferative disease (Klein et
al., 1995, Blood 85, 863-872; Hilbert et al., 1995, J
Exp Med 182, 243-248; Brandt et al., 1990, Curr Topic
Microbiol Immunol 166, 37-41; Leger et al., 1991,
Blood 78, 2923-2930; Burger et al., 1994, Annal
Hematol 69, 25-31; Tosato et al., 1993, J Clin Invest
91, 2806-2814). IL-6 production is induced by either
EBV or CMV infection and is an autocrine factor for
EBV-infected lymphoblastoid cells that enhances their
tumorigenicity in nude mice (Tosato et al., 1990, J
Virol 64, 3033-3041; Scala et al., 1990, J Exp Med
172, 61-68; Almeida et al., 1994, Blood 83, 370-376).
Cell lines derived from KS lesions, although not
infected with KSHV, also produce and respond to IL-6
(Miles et al., 1990, Proc Natl Acad Sci USA 87,
4068-4072; Yang et al., 1994, J Immunol 152, 943-955).
While MIP and IL-6 are secreted cytokines, the IRF
family of polypeptides regulate interferon-inducible
genes in response to γ- or α-/β-interferon cytokines
by binding to specific interferon consensus sequences
(ICS) within interferon-inducible promoter regions.
A broad array of cellular responses to interferons is
modulated by the repressor or transactivator functions
of IRF polypeptides and several members (IRF-1 and
IRF-2) have opposing anti-oncogenic and oncogenic
activities (Sharf et al., 1995, J Biol Chem 270,
13063-13069; Harada et al., 1993, Science 259,
971-974; Weisz et al., 1994, Internat Immunol 6,
1125-1131; Weisz et al., 1992, J Biol Chem 267,
25589-25596).



The 289 bp ORF K6 (ORF MIP1) gene encodes a 10.5 kDa
polypeptide (vMIP-I; MIP1) having 37.9% amino acid
identity (71% similarity) to huMIP-1α and slightly
lower similarity to other β-chemokines (Figure 3A).
ORF K4 also encodes a predicted 10.5 kDa polypeptide
(vMIP-II; vMIP1α-II) with close similarity and amino
acid hydrophobicity profile to vMIP-I. The two
KSHV-encoded MIP β-chemokines are separated from each
other on the KSHV genome by 5.5 kb of intervening
sequence containing at least 4 ORFs (see METHODS).
Both polypeptides have conserved β-chemokine motifs
(Figure 3A, residues 17-55) which include a
characteristic C-C dicysteine dimer (Figure 3A,
residues 36-37), and have near sequence identity to
human MIP-1α at residues 56-84. However, the two
polypeptides show only 49.0% amino acid identity to
each other and are markedly divergent at the
nucleotide level indicating that this duplication is
not a cloning artifact. The two viral polypeptides
are more closely related to each other
phylogenetically than to huMIP-1α, huMIP-1β or
huRANTES suggesting that they arose by gene
duplication rather than independent acquisition from
the host genome (see Sequence alignment in METHODS).
The reason for this double gene dosage in the viral
genome is unknown.
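
Percent identity and percent similarity figures of the kind
quoted for the vMIPs can be derived from a gapped pairwise
alignment. The sketch below is illustrative only: the text
does not say how similarity was scored or which columns were
counted, so the conservative-substitution groups, the choice
to divide by the number of gap-free columns, and the toy
aligned strings are all assumptions, and the function name
is hypothetical.

```python
# Assumed conservative-substitution groups; the scoring scheme used in the
# source is not stated, so this table is illustrative only.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]
GROUP_OF = {aa: i for i, group in enumerate(SIMILAR_GROUPS) for aa in group}


def identity_similarity(aligned_a: str, aligned_b: str):
    """Return (% identity, % similarity) over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = identical = similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # columns containing a gap are not scored (assumption)
        columns += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif x in GROUP_OF and GROUP_OF[x] == GROUP_OF.get(y):
            similar += 1
    if columns == 0:
        return 0.0, 0.0
    return 100.0 * identical / columns, 100.0 * similar / columns


if __name__ == "__main__":
    # Toy aligned fragments, not the vMIP or huMIP sequences.
    print(identity_similarity("ACDE-K", "ACEEQR"))
```

Because the denominator and the similarity table are
choices, values computed this way need not match the
published percentages exactly.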

The KSHV ORF K2 (Figure 3B) encodes a hypothetical 204
residue, 23.4 kDa IL-6-like polypeptide with a
hydrophobic 19 amino acid secretory signaling peptide
having 24.8% amino acid identity and 62.2% similarity
to the human polypeptide. vIL-6 also has a conserved
sequence characteristic for IL-6-like interleukins
(amino acids 101-125 of the gapped polypeptide) as
well as conserved four cysteines which are present in
IL-6 polypeptides (gapped alignment residue positions
72, 76, 101 and 111 in Figure 3B). IL-6 is a
glycosylated cytokine and potential N-linked
glycosylation sites in the vIL-6 sequence are present
at gapped positions 96 and 107 in Figure 3B. The 449
residue KSHV vIRF polypeptide encoded by ORF K9 has
lower overall amino acid identity (approximately 13%)
to its human cellular counterparts than either of the
vMIPs or the vIL-6, but has a conserved region derived
from the IRF family of polypeptides (Figure 3C, gapped
residues 86-121). This region includes the
tryptophan-rich IRF ICS DNA binding domain although
only two of four tryptophans thought to be involved in
DNA binding are positionally conserved. It is
preceded by an 87-residue hydrophilic N-terminus with
little apparent IRF similarity. A low degree of amino
acid similarity is present at the C-terminus
corresponding to the IRF family
transactivator/repressor region.
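
Potential N-linked glycosylation sites are conventionally
located by scanning a protein for the sequon N-X-S/T, where
X is any residue except proline. The sketch below applies
that general rule to an ungapped sequence; it is not the
procedure documented in METHODS, the function name is
hypothetical, and the example string is a toy peptide rather
than the vIL-6 sequence. Positions in the text are given on
the gapped alignment, so they would need to be mapped back
before comparison.

```python
import re

# Sequon N-X-[S/T] with X != P. The lookahead lets overlapping sites be found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glyc_sites(protein: str):
    """Return 1-based positions of the Asn in each candidate N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "MNGSANPTNNTS"  # toy peptide, not a KSHV protein
    print(n_glyc_sites(toy))
```

A sequon match only marks a candidate site; whether the
asparagine is actually glycosylated depends on context that
a pattern scan cannot capture.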




The four KSHV cell signaling pathway genes show
similar patterns of expression in virus-infected
lymphocyte cell lines by Northern blotting (see
METHODS). Whole RNA was extracted from BCP-1 (a cell
line infected with KSHV alone) and BC-1 (EBV and KSHV
coinfected, see Cesarman et al., 1995, Blood 86,
2708-2714) with or without pretreatment with 20 ng/ml
12-O-tetradecanoylphorbol-13-acetate (TPA, Sigma, St.
Louis MO) for 48 hours. While constitutive expression
of these genes was variable between the two cell
lines, expression of all four gene transcripts
increased in BCP-1 and BC-1 cells after TPA induction
(Figures 4A-4D). This pattern is consistent with
expression occurring primarily during lytic phase
virus replication. Examination of viral terminal
repeat sequences of BCP-1 and BC-1 demonstrates that
low level of virus lytic replication occurs in BCP-1
but not BC-1 without TPA induction (see METHODS), and
both cell lines can be induced to express lytic phase
genes by TPA treatment despite repression of DNA
replication in BC-1. Lower level latent expression is
also likely, particularly for vIL-6 (Figure 4C) and
vIRF (Figure 4D), since these transcripts are
detectable without TPA induction in BC-1 cells which
are under tight latency control. To determine if in
vitro KS spindle cell cultures retain defective or
partial virus sequences that include these genes, DNA
was extracted from four KS spindle cell lines (KS-2,
KS-10, KS-13 and KS-22) and PCR amplified for vMIP-I,
vMIP-II, vIL-6 and vIRF sequences (see METHODS). None
of the spindle cell DNA samples were positive for any
of the four genes.

vIL-6 was examined in more detail using bioassays and
antibody localization studies to determine whether it
is functionally conserved. Recombinant vIL-6 (rvIL-6)
is specifically recognized by antipeptide antibodies
which do not cross-react with huIL-6 (Figures 5A-5B)
(see METHODS). vIL-6 is produced constitutively in
BCP-1 cells and increases markedly after 48 hour TPA
induction, consistent with Northern hybridization
experiments. The BC-1 cell line coinfected with both
KSHV and EBV only shows vIL-6 polypeptide expression
after TPA induction (Figure 5A, lanes 3-4) and control
EBV-infected P3HR1 cells are negative for vIL-6
expression (Figure 5A, lanes 5-6). Multiple high
molecular weight bands present after TPA induction
(21-25 kDa) may represent precursor forms of the
polypeptide. Despite regions of sequence
dissimilarity between huIL-6 and vIL-6, the virus
interleukin 6 has biologic activity in functional
bioassays using the IL-6-dependent mouse plasmacytoma
cell line B9 (see METHODS). COS7 supernatants from
the forward construct (rvIL-6) support B9 cell
proliferation measured by 3H-thymidine uptake
indicating that vIL-6 can substitute for cellular IL-6
in preventing B9 apoptosis (Figure 6). vIL-6
supported B9 proliferation is dose dependent with the
unconcentrated supernatant from the experiment shown
in Figure 6 having biologic activity equivalent to
approximately 20 pg per ml huIL-6.

Forty-three percent of noninduced BCP-1 cells (Figure
7A) have intracellular cytoplasmic vIL-6
immunostaining (see METHODS) suggestive of
constitutive virus polypeptide expression in cultured
infected cells, whereas no specific immunoreactive
staining is present in uninfected control P3HR1 cells
(Figure 7B). vIL-6 production was rarely detected in
KS tissues and only one of eight KS lesions examined
showed clear, specific vIL-6 immunostaining in less
than 2% of cells (Figure 7C). The specificity of this
low positivity rate was confirmed using preimmune sera
and neutralization with excess vIL-6 peptides. Rare
vIL-6-producing cells in the KS lesion are positive
for either an endothelial cell marker (Figure 8A), or
CD45, a pan-hematopoietic cell marker (Figure 8B),
demonstrating that both endothelial and
hematopoietic cells in KS lesions produce vIL-6. It is
possible that these rare vIL-6 positive cells are
entering lytic phase replication which has been shown
to occur using the KSHV T1.1 lytic phase RNA probe.
In contrast, well over half (65%) of ascitic lymphoma
cells pelleted from an HIV-negative PEL are strongly
positive for vIL-6 (Figure 7D) and express the plasma
cell marker EMA (Cesarman et al., 1995, Blood 86,
2708-2714) indicating that either most PEL cells in
vivo are replicating a lytic form of KSHV or that
latently infected PEL cells can express high levels of
vIL-6. No specific staining occurred with any control
tissues examined including normal skin, tonsillar
tissue, multiple myeloma or angiosarcoma using either
preimmune or post-immune rabbit anti-vIL-6 antibody
(Figure 7E and 7F).



Virus dissemination to nonKS tissues was found by
examining a lymph node from a patient with AIDS-KS who
did not develop PEL. Numerous vIL-6-staining
hematopoietic cells were present in this lymph node
(Figure 8C) which was free of KS microscopically.
vIL-6 positive lymph node cells were present in
relatively B-cell rich areas and some express CD20 B
cell surface antigen (Figure 8D), but not EMA surface
antigen (unlike PEL cells) (Cesarman et al., 1995,
Blood 86, 2708-2714). No colocalization of vIL-6
positivity with the T cell surface antigen CD3 or the
macrophage antigen CD68 was detected, although
phagocytosis of vIL-6 immunopositive cells by
macrophages was frequently observed.

To investigate whether the vMIP-I can inhibit NSI
HIV-1 virus entry, human CD4+ cat kidney cells
(CCC/CD4) were transiently transfected with plasmids
expressing human CCR5 and vMIP-I or its reverse
construct I-PIMv (see CCR5 and vMIP-I cloning in
METHODS). These cells were infected with either M23
or SF162 primary NSI HIV-1 isolates which are known to
use CCR5 as a co-receptor (Clapham et al., 1992, J
Virol 66, 3531-3537) or with the HIV-2 variant ROD/B
which can infect CD4+ CCC cells without human CCR5.
Virus entry and replication was assayed by
immunostaining for retroviral antigen production
(Figure 9). vMIP-I cotransfection reduced NSI HIV-1
foci generation to less than half that of the
reverse-construct negative control but had no effect
on ROD/B HIV-2 replication.

Molecular piracy of host cell genes is a newly
recognized feature of some DNA viruses, particularly
herpesviruses and poxviruses (Murphy, 1994, Infect
Agents Dis 3, 137-154; Albrecht et al., 1992, J Virol
66, 5047-5058; Gao and Murphy, 1994, J Biol Chem 269,
28539-28542; Chee et al., 1990, Curr Top Microbiol
Immunol 154, 125-169; Massung et al., 1994, Virol 201,
215-240). The degree to which KSHV has incorporated
cellular genes into its genome is exceptional. In
addition to vMIP-I and vMIP-II, vIL-6 and vIRF, KSHV
also encodes polypeptides similar to bcl-2 (ORF 16),
cyclin D (ORF 72), complement-binding proteins
similar to CD21/CR2 (ORF 4), an NCAM-like adhesion
protein (ORF K14), and an IL-8 receptor (ORF 74). EBV
also either encodes (BHRF1/bcl-2) or induces (CR2;
cyclin D; IL-6; bcl-2; adhesion molecules and an
IL-8R-like EBI1 protein) these same cellular
polypeptides (Cleary et al., 1986, Cell 47, 19-28;
Tosato et al., 1990, J Virol 64, 3033-3041; Palmero et
al., 1993, Oncogene 8, 1049; Larcher et al., 1995, Eur
J Immunol 25, 1713-1719; Birkenbach et al., 1993, J
Virol 67, 2209-2220). Thus, both viruses may modify
similar host cell signaling and regulatory pathways.
EBV appears to effect these changes through induction
of cellular gene expression whereas KSHV introduces
the polypeptides exogenously from its own genome.



Identification of these virus-encoded
polypeptides leads to speculation about their
potential roles in protecting against cellular
antiviral responses. huIL-6 inhibits
γ-interferon-induced, Bax-mediated apoptosis in
myeloma cell lines (Lichtenstein et al., 1995,
Cellular Immunology 162, 248-255) and vIL-6 may play
a similar role in infected B cells. KSHV-encoded
vIRF, vbcl-2 and v-cyclin may also interfere with
host-cell mediated apoptosis induced by virus
infection and v-cyclin may prevent G1 cell cycle
arrest of infected cells. Interference with
interferon-induced MHC antigen presentation and
cell-mediated immune response (Holzinger et al., 1993,
Immunol Let 35, 109-117) by vIRF is also possible.
The β-chemokine polypeptides vMIP-I and vMIP-II may
have agonist or antagonist signal transduction roles.
Their sequence conservation and duplicate gene dosage
are indicative of a key role in KSHV replication and
survival.

Uncontrolled cell growth from cell-signaling pathway
dysregulation is an obvious potential by-product of
this virus strategy. Given the paucity of vIL-6
expressing cells in KS lesions, it is unlikely that
vIL-6 significantly contributes to KS cell neoplasia.
KSHV induction of hu-IL6, however, with subsequent
induction of vascular endothelial growth
factor-mediated angiogenesis (Holzinger et al., 1993,
Immunol Let 35, 109-117), is a possibility. vIL-6
could also potentially contribute to the pathogenesis
of KSHV-related lymphoproliferative disorders such as
PEL or the plasma cell variant of Castleman's disease.
The oncogenic potential of cellular cyclin and bcl-2
overexpression is well-established and these
virus-encoded polypeptides may also contribute to
KSHV-related neoplasia.

KSHV vMIP-I inhibits NSI HIV-1 replication in vitro
(Figure 9). Studies from early in the AIDS epidemic
indicate that survival is longer for AIDS-KS patients
than for other AIDS patients, and that 93% of US AIDS
patients surviving >3 years had KS compared to only
28% of remaining AIDS patients dying within 3 years of
diagnosis (Hardy, 1991, J AIDS 4, 386-391; Lemp et
al., 1990, J Am Med Assoc 263, 402-406; Rothenberg et
al., 1987, New Eng J Med 317, 1297-1302; Jacobson et
al., 1993, Am J Epidemiol 138, 953-964; Lundgren et
al., 1995, Am J Epidemiol 141, 652-658). This may be
due to KS occurring at relatively high CD4+ counts and
high mortality for other AIDS-defining conditions.
Recent surveillance data also indicate that the
epidemiology of AIDS-KS is changing as the AIDS
epidemic progresses (ibid).

METHODS 

Genomic Sequencing. Genomic inserts were randomly
sheared, cloned into M13mp18, and sequenced to an
average of 12-fold redundancy with complete
bidirectional sequencing. The descriptive
nomenclature of KSHV polypeptides is based on the
naming system derived for herpesvirus saimiri
(Albrecht et al., 1992, J Virol 66, 5047-5058).

Open reading frame (ORF) analysis. Assembled sequence
contigs were analyzed using MacVector (IBI-Kodak,
Rochester NY) for potential open reading frames
greater than 25 amino acid residues and analyzed using
BLASTX and BEAUTY-BLASTX (Altschul et al., 1990, J Mol
Biol 215, 403-410; Worley et al., 1995, Genome Res 5,
173-184; http://dot.imgen.bcm.tmc.edu:9331/seq-
search/nucleic_acid-search.html). Similar proteins
aligned to the four KSHV polypeptides
included (name (species, sequence bank accession
number, smallest sum Poisson distribution probability
score)): (1) vMIP-I: LD78 (MIP-1α) (human, gi 127077,
p=3.[illegible]xe-22), MIP-1α (Rattus, gi 790633, p=3.3xe-20),
MIP-1α (Mus, gi 127079, p=1.7xe-19), MIP-1β (Mus, gi
1346534, p=7.8xe-18); (2) vMIP-II: LD78 (MIP-1α)
(human, gi [illegible], p=7.1xe-[illegible]),
[illegible] (gi 12707[illegible], p=8.9xe-21), MIP-1α
(Rattus, gi 790633, p=1.2xe-20), MIP-1β (Mus, gi
1346534, p=3.8xe-20); (3) vIL-6: 26 kDa polypeptide
(IL-6) (human, gi 23835, p=7.2xe-17), IL-6 (Macaca, gi
514386, p=1.6xe-16); and (4) vIRF: ICSBP (Gallus,
gi 662255, p=1.1xe-11), ICSBP (Mus, sp P23611,
p=1.0xe-10), lymphoid specific interferon regulatory
factor (Mus, gi [illegible]72949, p=2.0xe-10), ISGF3
(Mus, gi 1263310, p=8.1xe-10), IRF4 (human, gi 1272477,
p=1.0xe-9), ISGF3 (human, sp Q00978,
p=3.[illegible]xe-8), ICSBP (human, sp Q02556,
p=2.3xe-8).
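
The ORF screen described in this section (frames encoding more than 25 residues) can be illustrated with a minimal sketch. It is not the MacVector implementation; the function name `find_orfs`, the requirement of an ATG start codon, and the use of the three standard stop codons are assumptions made for illustration only.

```python
# Minimal six-frame ORF scan (illustrative sketch, not MacVector's algorithm).
# Assumptions: ORFs begin at ATG and end at the first in-frame stop codon.

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_residues=26):
    """List ORFs encoding at least `min_residues` amino acids.

    Each hit is (strand, frame, start, end, n_residues). `start` and `end`
    are 0-based, end-exclusive offsets on the scanned strand, and `end`
    includes the stop codon. The default of 26 corresponds to
    "greater than 25 amino acid residues".
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    n_residues = (i - start) // 3  # stop codon not counted
                    if n_residues >= min_residues:
                        orfs.append((strand, frame, start, i + 3, n_residues))
                    start = None
    return orfs
```

Candidate frames found this way would then be compared against protein databases, as the text describes for BLASTX and BEAUTY-BLASTX.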

Sequence alignment. Amino acid sequences were aligned
using CLUSTAL W (Thompson et al., 1994, Nuc Acids Res
22, 4673-4680) and compared using PAUP 3.1.1. Both
rooted and unrooted bootstrap comparisons produced
phylogenetic trees having all 100 bootstrap replicates
with viral polypeptides being less divergent from each
other than from the human polypeptides.

Northern blotting. Northern blotting was performed
using standard conditions with random-labeled probes
(Chang et al., 1994, Science 265, 1865-1869) derived
from PCR products for the following primer sets:
vMIP-I: 5'-AGC ATA TAA GGA ACT CGG CGT TAC-3' (SEQ ID
NO:4), 5'-GGT AGA TAA ATC CCC CCC CTT TG-3' (SEQ ID
NO:5); vMIP-II: 5'-TGC ATC AGC TTC TTC ACC CAG-3' (SEQ
ID NO:6), 5'-TGC TGT CTC GGT TAC CAG AAA AG-3' (SEQ ID
NO:7); vIL-6: 5'-TCA CGT CGC TCT TTA CTT ATC GTG-3'
(SEQ ID NO:8), 5'-CGC CCT TCA GTG AGA CTT CGT AAC-3'
(SEQ ID NO:9); vIRF: 5'-CTT GCG ATG AAC CAT CCA GG-3'
(SEQ ID NO:10), 5'-ACA ACA CCC AAT TCC CCG TC-3' (SEQ
ID NO:11) on total cell RNA extracted with RNAzol
according to manufacturer's instructions (TelTest Inc,
Friendswood TX), and 10 µg of total RNA was loaded in
each lane. BCP-1, BC-1 and P3HR1 were maintained in
culture conditions and induced with TPA as previously
described (Gao et al., 1996, New Eng J Med 335,
233-241). PCR amplification for these viral genes was
performed using the vMIP-I, vMIP-II, vIL-6, and vIRF
primer sets with 35 amplification cycles and compared
to dilutions of whole BC-1 DNA as a positive control
using PCR conditions previously described (Moore and
Chang, 1995, New Eng J Med 332, 1181-1185). KS
spindle cell line DNA used for these experiments was
described in Dictor et al., 1996, Am J Pathol 146,
2009-2016. Amplifiability of DNA samples was
confirmed using human HLA-DQ alpha and pyruvate
dehydrogenase primers.
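
As an illustration, the primer sets above can be checked for length and GC content with a short script. The sequences are those of SEQ ID NOs:4-11 in the Sequence Listing; the helper names `reverse_complement`, `gc_fraction`, and `primer_summary` are hypothetical and are not part of the original methods.

```python
# Primer pairs as listed for SEQ ID NOs:4-11 (first primer, second primer).
PRIMERS = {
    "vMIP-I": ("AGCATATAAGGAACTCGGCGTTAC", "GGTAGATAAATCCCCCCCCTTTG"),
    "vMIP-II": ("TGCATCAGCTTCTTCACCCAG", "TGCTGTCTCGGTTACCAGAAAAG"),
    "vIL-6": ("TCACGTCGCTCTTTACTTATCGTG", "CGCCCTTCAGTGAGACTTCGTAAC"),
    "vIRF": ("CTTGCGATGAACCATCCAGG", "ACAACACCCAATTCCCCGTC"),
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the primer."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def primer_summary(primers):
    """Return (gene, sequence, length, GC fraction) for every primer."""
    rows = []
    for gene, pair in primers.items():
        for seq in pair:
            rows.append((gene, seq, len(seq), round(gc_fraction(seq), 2)))
    return rows


if __name__ == "__main__":
    for row in primer_summary(PRIMERS):
        print(row)
```

The lengths reported by such a check should agree with the LENGTH fields of the corresponding Sequence Listing entries.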

vIL-6 cloning. vIL-6 was cloned from a 695 bp
polymerase chain reaction (PCR) product using the
following primer set: 5'-TCA CGT CGC TCT TTA CTT ATC
GTG-3' (SEQ ID NO:12) and 5'-CGC CCT TCA GTG AGA CTT
CGT AAC-3' (SEQ ID NO:13), amplified for 35 cycles
using 0.1 µg of BC-1 DNA as a template. PCR
product was initially cloned into pCR 2.1 (Invitrogen,
San Diego CA) and an EcoRV insert was then cloned into
the pMET7 expression vector (Takebe et al., 1988, Mol
Cell Biol 8, 466-472) and transfected using
DEAE-dextran with chloroquine into COS7 cells
(CRL-1651, American Type Culture Collection, Rockville
MD). The sequence was also cloned into the pMET7
vector in the reverse orientation (6-LIv) relative to
the SRα promoter as a negative control, with
orientation and sequence fidelity of both constructs
confirmed by bidirectional sequencing using dye-primer
chemistry on an ABI 377 sequenator (Applied Biosystems
Inc, Foster City CA).

ml of serum-free COS7 supernatants were
concentrated to 1.5 ml by ultrafiltration with a
Centriplus 10 filter (Amicon, Beverly MA) and 100 µl
of supernatant concentrate or 1 µg of rhuIL-6 (R&D
Systems, Minneapolis MN) was loaded per lane in
Laemmli buffer. For cell lysate immunoblotting,
exponential phase cells with and without 20 ng/ml TPA
induction for 48 hours were pelleted and 100 µg of
whole cell protein solubilized in Laemmli buffer was
loaded per lane, electrophoresed on a 15%
SDS-polyacrylamide gel and immunoblotted and developed
using standard conditions (Gao et al., 1996, New Eng
J Med 335, 233-241) with either rabbit antipeptide
antibody (1:100-1:1000 dilution) or anti-huIL-6 (1 µg
per ml, R&D Systems, Minneapolis MN).

Cell line B9. The B9 mouse plasmacytoma cell line was
maintained in Iscove's Modified Dulbecco's Medium
(IMDM) (Gibco, Gaithersburg, MD), 10% fetal calf
serum, 1% penicillin/streptomycin, 1% glutamine, 50 µM
β-mercaptoethanol, and 10 ng per ml rhuIL-6 (R&D
Systems, Minneapolis, MN). 3H-thymidine uptake was
used to measure B9 proliferation in response to huIL-6
or recombinant supernatants according to standard
protocols (R&D Systems, Minneapolis, MN). Briefly,
serial 1:3 dilutions of huIL-6 or Centriplus 10
concentrated recombinant supernatants were incubated
with 2x10^4 cells per well in a 96 well plate for 24
hours at 37°C with 10 µl of thymidine stock solution
(50 µl of 1 mCi/ml 3H-thymidine in 1 ml IMDM) added to
each well during the final four hours of incubation.
Cells were harvested and incorporated 3H-thymidine
determined using a liquid scintillation counter. Each
data point is the average of six determinations with
standard deviations shown.
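
The dilution series and per-point statistics described above can be sketched as follows. The starting concentration, the number of dilution steps, and the function names are assumptions for illustration; the text specifies only the 1:3 dilution factor and six determinations per data point.

```python
from statistics import mean, stdev


def serial_dilution(start_conc, factor=3, steps=8):
    """Concentrations for a serial 1:factor dilution series.

    `start_conc` and `steps` are illustrative; the source gives only
    the 1:3 dilution factor.
    """
    return [start_conc / factor ** i for i in range(steps)]


def summarize(cpm_replicates):
    """Mean and sample standard deviation of replicate scintillation counts."""
    return mean(cpm_replicates), stdev(cpm_replicates)


if __name__ == "__main__":
    print(serial_dilution(9.0, steps=3))
    # Six hypothetical counts-per-minute readings for one data point.
    print(summarize([1200, 1150, 1300, 1250, 1180, 1220]))
```

Using the sample standard deviation is a choice made here; the source does not state which estimator was used.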


vIL-6 immunostaining. Immunostaining was performed
using the avidin-biotin complex (ABC) method after
deparaffinization of tissues and quenching for 30
minutes with 0.03% H2O2 in PBS. The primary antibody
was applied at a dilution of 1:1250 after blocking
with 10% normal goat serum, 1% BSA, 0.5% Tween 20.
The secondary biotinylated goat anti-rabbit antibody
(1:200 in PBS) was applied for 30 minutes at room
temperature followed by three 5 minute washes in PBS.
Peroxidase-linked ABC (1:100 in PBS) was applied for
30 minutes followed by three 5 minute washes in PBS.
A diaminobenzidine (DAB) chromogen detection solution
(0.25% DAB, 0.01% H2O2 in PBS) was applied for 5
minutes. Slides were then washed, counterstained with
hematoxylin and coverslipped. Amino ethyl carbazole
(AEC) or Vector Red staining was also used, allowing
better discrimination of double-labeled cells with
Fast Blue counterstaining for some surface antigens.
For CD68, in which staining might be obscured by vIL-6
cytoplasmic staining, double label immunofluorescence
was used. Microwaved tissue sections were blocked
with 2% human serum, 1% bovine serum albumin (BSA) in
PBS for 30 minutes, incubated overnight with primary
antibodies and developed with fluorescein-conjugated
goat anti-rabbit IgG (1:100, Sigma) for vIL-6
localization and rhodamine-conjugated horse anti-mouse
IgG (1:100, Sigma) for CD68 localization for 30
minutes. After washing, secondary antibody incubation
was repeated twice with washing for 15 minutes each to
amplify staining. For the remaining membrane
antigens, slides were developed first for vIL-6 and
then secondly with the cellular antigen, as well
as the reverse localization (cellular antigen antibody
first, anti-vIL-6 second) to achieve optimal
visualization and discrimination of both antigens. In
each case, the first antibody was developed using AEC
(Sigma) with blocking solution preincubation (1% BSA,
10% normal horse serum, 0.5% Tween 20 for 30 minutes)
and development per manufacturer's instructions. The
second antibody was developed using the ABC-alkaline
phosphatase technique with Fast Blue chromagen. Both
microwaving and trypsinization resulted in poorer
localization and specificity of vIL-6
immunolocalization. In cases where this was required
for optimal localization of membrane antigen, these
techniques were applied after vIL-6 AEC localization.
Vector-Red (Vector, Burlingame, CA) staining was used
as an alternative stain to AEC to achieve optimal
discrimination and was performed per manufacturer's
protocol using the ABC-alkaline phosphatase technique.
Cell antigen antibodies examined included CD68 (1:800,
from clone Kim 6), epithelial membrane antigen (EMA,
1:500, Dako, Carpinteria, CA), CD3 (1:200, Dako),
CD20 (1:200, Dako), OPD4 (1:100, Dako), CD34 (1:15,
Dako), CD45 (1:400, from clone 9.4), L26 (1:100,
Immunotech, Westbrook, ME) and Leu22 (1:100,
Becton-Dickinson, San Jose, CA) on tissues prepared
according to manufacturer's instructions. Specific
vIL-6 colocalization was only found with CD34 and CD45
in KS lesions, EMA in PEL, and CD20 and CD45 in lymph
node tissues.

Immunohistochemical vIL-6 localization was performed
on exponential phase BCP-1 cells with or without 48
hour TPA incubation after embedding in 1% agar in
saline. The percentages of positive cells were
determined from cell counts of three random high power
microscopic fields per slide. Lower percentages of
BCP-1 cells stain positively for vIL-6 after TPA
treatment, possibly reflecting cell lysis and death
from lytic virus replication induction by TPA.
Immunostaining of cells and tissues was demonstrated
to be specific by neutralization using overnight
incubation of antisera with 0.1 µg/ml vIL-6 synthetic
peptides at 4°C and by use of preimmune rabbit antisera
run in parallel with the postimmune sera for the
tissues or cell preparations. No specific staining
was seen after either peptide neutralization or use of
preimmune sera.
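
The percentage of positive cells per slide, counted over three high power fields, can be computed as in the sketch below. Pooling the counts across fields (rather than averaging per-field percentages) is an assumption, since the text does not say which was done; the function name and the example counts are hypothetical.

```python
def percent_positive(fields):
    """Percent of stained cells pooled over several microscopic fields.

    `fields` is an iterable of (positive_cells, total_cells) pairs,
    one pair per high power field.
    """
    fields = list(fields)
    positive = sum(p for p, _ in fields)
    total = sum(t for _, t in fields)
    if total == 0:
        raise ValueError("no cells counted")
    return 100.0 * positive / total


if __name__ == "__main__":
    # Three hypothetical fields from one slide.
    print(percent_positive([(5, 50), (10, 100), (15, 50)]))
```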

CCR5 and vMIP-I cloning. CCR5 was cloned into pRcCMV
vector (Invitrogen) and both forward and reverse
orientations of the vMIP-I gene were cloned into pMET7
after PCR amplification using the following primer
pairs: 5'-AGC ATA TAA GGA ACT CGG CGT TAC-3' (SEQ ID
NO:14), 5'-GGT AGA TAA ACT CCC CCC CTT TG-3' (SEQ ID
NO:15). CCR5 alone and with the forward construct
(vMIP-I), the reverse construct (I-PIMv) and empty
pMET7 vector were transfected into CCC/CD4 cells (CCC
cat cells stably expressing human CD4, see McKnight et
al., 1994, Virol 201, 8-18) using Lipofectamine
(Gibco). After 48 hours, media was removed from the
transfected cells and 1000 TCID50 of SF162, M23 or
ROD/B virus culture stock was added. Cells were
washed four times after 4 hours of virus incubation
and grown in DMEM with 5% FCS for 72 hours before
immunostaining for HIV-1 p24 or HIV-2 gp105 as
previously described. Each condition was replicated
3-4 times (Figure 9) with medians and error bars
representing the standard deviations expressed as
percentages of the CCR5 alone foci.
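
The normalization described above, in which foci counts are expressed as percentages of the CCR5-alone condition, can be sketched as follows. Dividing by the median of the control replicates and the function name `percent_of_control` are assumptions; the foci counts in the example are invented for illustration and are not data from the experiments.

```python
from statistics import median, stdev


def percent_of_control(condition_foci, control_foci):
    """Express replicate foci counts as percentages of the control median.

    Returns (median percentage, sample standard deviation of percentages).
    """
    control_median = median(control_foci)
    if control_median == 0:
        raise ValueError("control median is zero")
    percents = [100.0 * x / control_median for x in condition_foci]
    return median(percents), stdev(percents)


if __name__ == "__main__":
    # Hypothetical replicate counts: a test construct vs. CCR5 alone.
    print(percent_of_control([40, 50, 60], [90, 100, 110]))
```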
EXPERIMENTAL DETAILS SECTION III:

The following patents are hereby incorporated by
reference to more fully describe the invention
described herein:

1. Fowlkes, CARBOXY TERMINAL IL-6 MUTEINS, PATENT NO.
5,565,336, ISSUED October 15, 1996;

2. Skelly et al., METHOD OF MAKING CYSTEINE DEPLETED
IL-6 MUTEINS, PATENT NO. 5,545,537, ISSUED August 13,
1996;

3. Ulrich, COMPOSITION AND METHOD FOR TREATING
INFLAMMATION, PATENT NO. 5,376,368, ISSUED December
2[illegible], 1994;

4. Skelly et al., CYSTEINE DEPLETED IL-6 MUTEINS,
PATENT NO. 5,359,034, ISSUED October 25, 1994;

5. Williams, ULTRA PURE HUMAN INTERLEUKIN 6, PATENT
NO. 5,335,834, ISSUED August 16, 1994;

6. Fowlkes, CARBOXY TERMINAL IL-6 MUTEINS, PATENT NO.
5,335,833, ISSUED August 16, 1994;

7. Ulrich, COMPOSITION AND METHOD FOR TREATING
INFLAMMATION, PATENT NO. 5,300,292, ISSUED April 05,
1994;

8. Mikayama et al., MODIFIED HIL-6, PATENT NO.
5,264,209, ISSUED November 23, 1993;

9. Park, HYPERGLYCOSYLATED CYTOKINE CONJUGATES,
PATENT NO. 5,217,681, ISSUED June 08, 1993;

10. Goldberg and Faquin, INTERLEUKIN 6 TO STIMULATE
ERYTHROPOIETIN PRODUCTION, PATENT NO. 5,188,[illegible]25,
ISSUED February 23, 1993;

11. Miles et al., METHOD TO TREAT KAPOSI'S SARCOMA,
PATENT NO. 5,470,624, ISSUED November 25, 1995;

12. Li and Ruben, MACROPHAGE INFLAMMATORY PROTEIN
-3 AND -4 [Isolated polynucleotide encoding
polypeptide], PATENT NO. 5,504,003, ISSUED April 02,
1996;

13. Gewirtz, SUPPRESSION OF MEGAKARYOCYTOPOIESIS BY
MACROPHAGE INFLAMMATORY PROTEINS [Reducing number of
circulating platelets in bloodstream], PATENT NO.
5,306,709, ISSUED April 26, 1994;

14. Fahey et al., METHOD AND AGENTS FOR PROMOTING
WOUND HEALING, PATENT NO. 5,145,676, ISSUED September
8, 1992;

15. Rosen et al., POLYNUCLEOTIDE ENCODING MACROPHAGE
INFLAMMATORY PROTEIN GAMMA, PATENT NO. 5,556,767,
ISSUED September 17, 1996;

16. Chuntharapai et al., ANTIBODIES TO HUMAN IL-8
TYPE A RECEPTOR, PATENT NO. 5,543,503, ISSUED August
06, 1995;

17. Chuntharapai et al., ANTIBODIES TO HUMAN IL-8
TYPE B RECEPTOR [A monoclonal antibody as
antiinflammatory agent treating an inflammatory
disorder], PATENT NO. 5,440,021, ISSUED August 08,

18. Kunkel et al., LABELLED MONOCYTE CHEMOATTRACTANT
Table 1. KSHV Genome ORFs and their similarity to
genes in other herpesviruses.



Pol Sr.arr 



Stop 



y.i 

ORF4 * 
*■ + 

ORF6 
ORF 7 
ORF8 
ORF 9 
ORF10 
ORF 11 
K2 

ORFC2 
K3 

ORF7 0 

K4 

K5 

K6 

K™ 

ORF 16 
ORF 17 
ORF 18 
ORF 15 
ORF 2 0 
ORF 21 
ORF2 2 
ORF2 3 
ORF 2 4 
ORF2 5 
ORF2 6 
ORF 2 7 
ORF2 S 
ORF2Sb 
ORF3 0 
ORF 3 1 
ORF3 2 
ORF3 3 
ORF 2 9a 
ORF 3 4 
ORF3 5 
ORF 3 5 
ORF3 7 
ORF 3 8 
ORF3 9 
ORF4 0 
ORF 41 
ORF4 2 
ORF4 3 
ORF4 4 
ORF 4 5 
ORF46 
ORF4 7 
ORF4 6 
ORF4 9 
ORF 5 0 
K8 

ORF 5 2 
ORFS 3 
ORF 5 4 
CRF5 5 
ORFES 
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974 




1142 


2794 




3210 


6611 




6628 


8715 




e€99 


11, 236 




11 . 


363 


14 ,401 




14 , 


519 


15, 775 


4- 


15 . 


790 


17 , 013 




. 27 , 


875 


27,261 




16 , 


553 


17 , 921 
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609 


18,608 




21 , 
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20 , 091 




-i -. 


, B32 


21 , 548 




2c . 
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25, 713 




27 


, 424 


27 , 13-7 




28 


. 621 


29 , 0C2 




3C 


. 145 


30, 672 




31 


.462 


30, 821 




32 


. 424 


22 , 197 




34 


, B42 


33 , 194 




35 


, 573 


34 , £11 


-»- 


2 = 


, 383 


37 , 125 



37,112 
4C . 516 

42 . ~-e 

46 , 933 
47, 672 
45 . 991 
5 0,41" 

5 0 . 623 
50 . 763 



5 1,404 
52 . 751 
54 . 67c 
54 .675 

5 3,635 
53 , 576 
5". 273 
56 , 65= 
60 , 175 

6 C , 3 0 B 
61.82- 
62 , 272 
64 . 953 
64 , 892 
63 . 576 
69 . 404 
69 , 915 
71 .381 
72 , 538 
71,734 
74 . 8 50 
77.15" 
77 .665 
7~ .667 
79,448 
79,436 



39. 305 
39, 302 
40 , 520 
4 6 , 907 
4 7 , e 5 C 
46 , 745 
49,299 
49, 362 
50, 856 
51,437 

52 , 766 
£3 . 695 

53 , 738 
55 , 656 
56 , 091 
5 7,210 
55 , 732 
58 , 573 
56 , 976 
61.681 
62 , 444 
62 ,436 
63 . 13 6 
6",25E 
5 7,353 
68.637 
65,412 
70 , 173 
71,630 

74 .625 

75 . 569 
76 , 8C2 
77,323 
7S , 623 
76 . 765 
61 , 967 



289 
55C 

1133 

695 

845 

1012 

416 

407 

2 04 

210 

333 

337 

94 

257 

95 

126 

175 

553 

257 
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320 

580 

730 

404 

752 

1376 

305 

290 

102 

351 

77 

224 

454 

312 

312 

327 

151 

444 

4B6 

61 

399 

457 

205 

278 

605 

788 

407 

255 

167 

4C2 

302 

631 

239 

121 

110 

318 

227 

B43 



HYS HVS 



FSV Name BBY ^BV 



45 
46 
74 
65 
72 
77 
50 
49 



. 3 
. 4 
. 1 
. 0 
. 5 
.6 
. 4 
. 4 



31 
34 
55 
44 
54 
£2 
26 
28 



.2 
. 0 



65.8 48.4 
79.5 66.4 



50 . 
60 . 
70 
62 
55 
50 
53 
57 
65 
80 
76 
49 
42 
4 1 

52 

63 

51 

5E 

41 

56 . 5 

60 . 0 



0 

3 

6 

8 

6 
. 9 
. 9 
. 4 
. 8 
. 9 
. 6 
. 6 

. 8 
. 1 
. 0 
. 7 
. 6 
. 9 



45 
65 
56 
73 
51 
33 
55 
74 
~5 
50 . 2 
73 . 0 
53 . 0 
47 . 3 
45.4 
46 . 5 



26 
4 2 
48 
43 . 

42 . 

32 . 
35 . 

33 . 
45 . 
65 . 
56 . 
29 . 
21 . 
17 . 
31 . 

43 . 
3 0 . 
36 
15 
42 
31 
31 

5 0 
39 
52 
28 
29 
36 

6 0 
6 1 
3 0 
59 
2 5 
24 
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5 
4 

e 
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1 
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£ 
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. 0 
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. 1 
. 4 
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BALF3 
BALF4 
BAL.F5 
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3VRF2 

BVRF1 

BXRF1 

3XL,F1 

3XLF2 

BTRF1 

BcRFl 

BcLFl 

3DLF1 

30LF2 

BDLF3 

BDRF1 

BDLF3 . 5 

EDLF4 

BGLF2 

E3LF2 

SGRF1 

BGI.F3 

BG1.F3 . 5 

SGLF4 

3GLF5 

EELF1 

E3RF3 

S3L.F2 

E3LF3 

53RF2 

33RF1 

E31>F4 

BKRF4 

EKRF3 

3KJRF4 

ERRF2 

BRRF1 

BRL.F1 
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55.6 36.0 31.RF1 

5 3.0 3 3.5 5-L.F3 

64.4 46.4 3SRF1 

62.5 44.3 BSLF1 
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ORF57 
KS 
K10 
Kll 
ORF58 
ORF5 9 
ORF6 0 
ORF6 1 
ORF62 
ORF6 3 
ORF64 
ORF6 5 
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ORF6 9 
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ORF7 3 
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ORF74 
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BFRF3 
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64 


. 7 
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BFLF1 
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BFLF2 
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32 . 
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31. 
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, 3 


BNRF1 



45 . 




22 
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2 5 


3 


50 . 
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26 . 


. 3 


74 , 
64 




5*7 
43 . 


_ 3 
. 6 


57 




34 




47 , 




24 


. 5 


46 


. 6 
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27 


. 6 


50 


. 0 


2c 


. 0 


62 . 


. 5 


2 9 


. 5 


58 . 




35 . 




60 . 




41 . 


. 7 



Name      Function

ORF4*     Complement binding protein (v-CBP)
**
ORF6      ssDNA binding protein (SSBP)
ORF7      Transport protein
ORF8      Glycoprotein B (gB)
ORF9      DNA polymerase (pol)
ORF10
ORF11
K2        vIL-6
ORF02     DHFR
K3        BHV4-IE1 I
ORF70     Thymidylate synthase (TS)
K4        vMIP-II
K5        BHV4-IE1 II
K6        vMIP-I
K7
ORF16     Bcl-2
ORF17     Capsid protein I
ORF18
ORF19     Tegument protein I
ORF20
ORF21     Thymidine kinase (TK)
ORF22     Glycoprotein H (gH)
ORF23
ORF24
ORF25     Major capsid protein (MCP)
ORF26     Capsid protein II
ORF27
ORF28
ORF29b    Packaging protein II
ORF30
ORF31
ORF32
ORF33
ORF29a    Packaging protein I
ORF34
ORF35
ORF36     Viral protein kinase
ORF37     Alkaline exonuclease (AE)
ORF38
ORF39     Glycoprotein M (gM)
ORF40     Helicase-primase, subunit 1
ORF41     Helicase-primase, subunit 2
ORF42
ORF43     Capsid protein III
ORF44     Helicase-primase, subunit 3
ORF45     Virion assembly protein
ORF46     Uracil DNA glycosylase (UDG)
ORF47     Glycoprotein L (gL)
ORF48
ORF49
ORF50     Transactivator (LCTP)
K8
ORF52
ORF53
ORF54     dUTPase
ORF55
ORF56     DNA replication protein I
ORF57     Immediate-early protein II (IEP-II)
K9        vIRF1 (ICSBP)
K10
K11
ORF58     Phosphoprotein
ORF59     DNA replication protein II
ORF60     Ribonucleotide reductase, small
ORF61     Ribonucleotide reductase, large
ORF62     Assembly/DNA maturation
ORF63     Tegument protein II
ORF64     Tegument protein III
ORF65     Capsid protein IV
ORF66
ORF67     Tegument protein IV
ORF68     Glycoprotein
ORF69
K12       Kaposin
K13
ORF72     Cyclin D
ORF73     Immediate-early protein (IEP)
K14       OX-2 (v-adh)
ORF74     G-protein coupled receptor
ORF75     Tegument protein/FGARAT
K15



Legend to Table 1. Name (e.g. K1 or ORF4) refers to
the KSHV ORF designation; Pol signifies polarity of
the ORF within the KSHV genome; Start refers to the
position of the first LUR nucleotide in the start
codon; Stop refers to the position of the last LUR
nucleotide in the stop codon; Size indicates the
number of amino acid residues encoded by the KSHV ORF;
HVS%Sim indicates the percent similarity of the
indicated KSHV ORF to the corresponding ORF of
herpesvirus saimiri; HVS%Id indicates the percent
identity of the indicated KSHV ORF to the
corresponding ORF of herpesvirus saimiri; EBV Name
indicates the EBV ORF designation; EBV%Sim indicates
the percent similarity of the indicated KSHV ORF to
the named Epstein-Barr virus ORF; EBV%Id indicates the
percent identity of the indicated KSHV ORF to the
named Epstein-Barr virus ORF. The asterisks in the
KSHV Name column indicate comparison of KSHV ORF4 to
HVS ORF4a (*) and HVS ORF4b (**). The entire
unannotated genomic sequence is deposited in GenBank,
under the accession numbers: U75698 (LUR), U75699
(terminal repeat), and U75700 (incomplete terminal
repeat). The sequence of the LUR (U75698) is also set
forth in its entirety in the Sequence Listing below.
Specifically, the sequence of the LUR is set forth in
5' to 3' order in SEQ ID NOs:17-20. More
specifically, nucleotides 1-35,100 of the LUR are set
forth in SEQ ID NO:17 numbered nucleotides 1-35,100,
respectively; nucleotides 35,101-70,200 of the LUR
are set forth in SEQ ID NO:18 numbered nucleotides
1-35,100, respectively; nucleotides 70,201-105,300 of
the LUR are set forth in SEQ ID NO:19 numbered
nucleotides 1-35,100, respectively; and nucleotides
105,301-137,507 of the LUR are set forth in SEQ ID
NO:20 numbered nucleotides 1-32,207, respectively.
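
Because the LUR is split across four sequence entries, a Start or Stop coordinate from Table 1 has to be translated before it can be looked up in the Sequence Listing. The sketch below performs that conversion using the segment boundaries stated in the legend; the function name `lur_to_seq_id` and the constant names are illustrative.

```python
SEGMENT_LENGTH = 35_100   # nucleotides per full segment (SEQ ID NOs:17-19)
FIRST_SEQ_ID = 17         # SEQ ID NO holding LUR nucleotides 1-35,100
LUR_LENGTH = 137_507      # total LUR length stated in the legend


def lur_to_seq_id(position):
    """Map a 1-based LUR coordinate to (SEQ ID NO, 1-based local position)."""
    if not 1 <= position <= LUR_LENGTH:
        raise ValueError("position outside the LUR")
    index, offset = divmod(position - 1, SEGMENT_LENGTH)
    return FIRST_SEQ_ID + index, offset + 1


if __name__ == "__main__":
    for pos in (1, 35_100, 35_101, 137_507):
        print(pos, lur_to_seq_id(pos))
```

The last LUR nucleotide maps to position 32,207 of SEQ ID NO:20, which matches the numbering given in the legend.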
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SEQUENCE LISTING 

(1) GENERAL INFORMATION:

     (i) APPLICANT: The Trustees of Columbia University
                    in the City of New York

    (ii) TITLE OF INVENTION: UNIQUE ASSOCIATED KAPOSI'S SARCOMA VIRUS
                             SEQUENCES AND USES THEREOF

   (iii) NUMBER OF SEQUENCES: 20

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Cooper & Dunham LLP
          (B) STREET: 1185 Avenue of the Americas
          (C) CITY: New York
          (D) STATE: New York
          (E) COUNTRY: U.S.A.
          (F) ZIP: 10036

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.30

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER:
          (B) FILING DATE:
          (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: White, John P.
          (B) REGISTRATION NUMBER: 28,676
          (C) REFERENCE/DOCKET NUMBER: 4516 S-G-PCT/ jP*

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: (212) 278-0400
          (B) TELEFAX: (212) 391-0525


(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 335 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



Phe vai Pro Leu Ser Leu Tyr Val Ala Lys Lys Leu Phe 



Met Phe Pro 

i 5 x 

Arg Ala Arg Gly ~* ^ 



rg Gly Phe Arg Phe Cys Gin Lys Pro Gly Val Leu A.a Leu 
2 0 

Ala Pro Glu Val Asp Pro Cys Ser He Gla His Glu Vai Thr Gly Ala 

- C 4 0 

Glu — Pro His Glu Glu Leu Gin Tyr Leu Arg Gin Leu Arg Glu lie 
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Leu cys A rg Gly Ser Asp Arg Leu Asp Arg Thr Giy He Gly Thr Leu 

65 7 ^ 
Ser Leu Phe Gly Met Gin Ala Arg Tyr Ser Leu Arg Asp His Phe Pre 

85 90 

Leu Leu Thr Thr Lys Arg Val Phe Trp Arg Gly V.l V.l Gin Glu Leu 

100 ±v:> 
Leu Tro Phe Leu Lys Gly Ser Thr Asp Ser Arg Glu Leu Ser Arg Thr 
115 120 

Glv Val lvs He Trp Asp Lys Asn Gly Ser Arg Glu Phe Leu Ala Gly 
- 130 ' 135 140 

Arg Gly Leu Ala His Arg Arg Glu Gly Asp Leu Gly Pro Val Tyr Gly 

145 150 

Phe Gin Trp Arg His Phe Gly Ala Ala Tyr Val Asp Ala Asp Ala Asp 
16= i70 

tv- Th~ Glv Gin Gly Phe Asp Gin Leu Ser Tyr lie Val Asp Leu He 

180 185 
L ys Asr. Asr. Pre Hi. Asp Arg Arg He He Met Cys Ala Trp Asn Pro 

Ala Asp Leu Ser Leu Met Ala Leu Pro Pro Cys His Leu Leu Cys Gin 



210 



Phe Tyr Val Ala Asp Gly Glu Leu Ser Cys Gin Leu Tyr Gin Arg Ser 

Gly Asp Mar Glv Leu Gly Val Pro Phe Asn He Ala Ser Tyr Ser Leu 

leu m~- Leu Ala His Val Thr Gly Leu Arg Pro Gly Glu Phe 

LSU 26"0 265 270 

Tie His Thr Leu Gly Asp Ala His lie Tyr Lys Thr His He Glu Pro 
215 280 

Leu Arc Leu Gin Leu Thr Arg Thr Pro Arg Pro Phe Pro Arg Leu Glu 
290 295 300 

He Leu Arg Ser Val Ser Ser Me, Glu Glu Phe Thr Pro Asp Asp Phe 
305 310 3.5 

Arg Leu Val Asp Tyr Cys Pro His Pro Thr He Arg Met Glu Met Ala 



Val 



(2) INFORMATION FOR SEQ ID NO:2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 10 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Thr His Tyr Ser Pro Pro Lys Phe Asp Arg



WO 98/04576                                   PCT/US97/13346



(2) INFORMATION FOR SEQ ID NO:3:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 10 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Pro Asp Val Thr Pro Asp Val His Asp Arg
1               5



(2) INFORMATION FOR SEQ ID NO:4:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 24 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: N

    (iv) ANTI-SENSE: N

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

AGCATATAAG GAACTCGGCG TTAC



(2) INFORMATION FOR SEQ ID NO:5:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 23 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: N

    (iv) ANTI-SENSE: N

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

GGTAGATAAA TCCCCCCCCT TTG



(2) INFORMATION FOR SEQ ID NO:6:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 21 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: N

    (iv) ANTI-SENSE: N

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

TGCATCAGCT TCTTCACCCA G

(2) INFORMATION FOR SEQ ID NO:7:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 23 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: N

    (iv) ANTI-SENSE: N

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

TGCTGTCTCG GTTACCAGAA AAG

(2) INFORMATION FOR SEQ ID NO:8:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 24 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: N

    (iv) ANTI-SENSE: N

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

TCACGTCGCT CTTTACTTAT CGTG

(2) INFORMATION FOR SEQ ID NO:9:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 24 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: N

    (iv) ANTI-SENSE: N

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

CGCCCTTCAG TGAGACTTCG TAAC

(2) INFORMATION FOR SEQ ID NO:10:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 20 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: N

    (iv) ANTI-SENSE: N

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

CTTGCGATGA ACCATCCAGG



(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
ACAACACICA ATTCCCCGTC 



(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
TCACGTCGCT CTTTACTTAT CGTG



(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)




(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
CGCCCTTCAG TGAGACTTCG TAAC 

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
AGCATATAAG GAACTCGGCG TTAC 

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
GGTAGATAAA CTCCCCCCCT TTG 

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 801 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
CGTGAACACC CCGCGCCCCG CGCCCCCCAC ACCGCGCCGC CCCTCCCCCT CCCICCGCTC 60
GCCTCCCGGC GCTGCCGCCA GGCCCCGGCC GGAGCCGGCC GCCCGCGGGG GGC?,GG3CGZ 120



GCCCGGCGGC TCCCTCGCC 



GGCGGGGGAC GG3GGAGGGG GGCGCCG^^w ■ «^o--o- -.6 0
CGCGGCAGCG GAGCGCGAGG GCCCCCGCC3 GCCGCCAGCG GCGGC3CAGG ZZZZGGGGZ Z -4 Z
CCGAGCCCCG AGCGGGGCCG GGGTACGGGG CTAGGCCACG AATAATT777 TTTTCGGGCG 
GCCCCCCGAA CCTCTCTCGG CCCCCCGGTC CCCGCGGCCC GCGCGCGCCC CC CCGGGGGG 
GTAAAACAGG GGG3GGGGGA TGCGGCCGCG GCGGCGCCCG CGGCGGCGGC GGCGCTTGC7 
TTCGTTTTCT ZZZGZGGZZZ CCCGGGCGCG AGCCGCGCGG ZGGZGGZGGG ZGZZZZZTZZ 
CCCGGGGGGC 7CGGCG3GGG GCCCCCTGTC CCCGCGCGGG CCCGCGACCC CCGGCGCCGC 
CGCGCCCC3A TCCCGCGGGC GCCCCGCCCC CCTGCCGGGG ACGCCGCCGG GCCTGCGGCG 60C 
cctcccgccc GGGCATGGGG CCGCGCGCCG CCTCAGGGCC CGGCGCGGCC GGCGCCTGG7 
CCZCGCCZZC GCCCGCGGGG GAACCCGGGC AGCGAGGGAA GGGGGZGZZZ TCTCTCTACT 
GTGCGAGGAG TCTGGGCTGC TGTGTGTGAG CCTGTTTGGG GGAGCCTCCT CAGTGCTTG2 780 

TACGTGGAGC CCTGGACACT A 801



(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35100 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
TACTAATTTT CAAAGGCGGG GTTCTGCCAG GCATAGTCTT TTTTTCTGGC GGCCCTTGTG 60

TAAACCTGTC TTTCAGACCT TGTTGGACAT CCTGTACAAT CAAGATGTTC CTGTATGTTG 120

TCTGCAGTCT GGCGGTTTGC TTTCGAGGAC TATTAAG Z CT TTCTCTGCTA T2GTCTCCAA 
AZTTGTGZ ^ TGGAGTGATT TCAACGCCTT ACAC3TTGAC ZZGTZTGZZZ AATGCATCCT 
TGCCAATATC CTGGTATTGC AACAATACTC GGCTTTTGCG ACTGACGGAG AGAAGAGTCA 
TTCTTGACAC CATTGCCTGC AATTTTACTT GTGTGGAACA ATCTGGGCAT CGA3AGAGCA 
TTTGGATTAC ATGGCGTGCA CAACCTGTCT TACAAACCTT GTGTGCACA3 C-~ 
CAGTCACTTG TGGTCAGCAT GTTACTTTGT ATTGTTCTAC ZT ZTG G AAAT AATGTTACCG 
TTTGGCATCT AC CAAACGGA CGAAATGAAA CCGTGTCACA AA CT AAAT A C TATAATTTTA 54 C 

CGCTGATGAG CCAAACTGAG GGGTGTTATA CT T G TT C T AA CG3GCTGTCG TCTCGCCT37 6 00 

CAAATCGTAT ATGTTTTTGG GCGCGTTGTG CCAATATAAC TC CAGAAACT CATACTGTAT 66C 
CTGTCAGCAG TACTACAGGC TTTAGAACAT TGAGTACTAA 7AGCTTAGTG AA3ATAATCC 72 0 

AT3CAACCAC ACGTGATGTA GTTGTAGTGA AAGAAGCAAA AT CT AC A CAT TTT CATATTG 7B0 
AAGTGCATTT TCTTGTATTT ATGACACTCG TAG CTCTGAT AGGAAC CATG T3T33TATCT
TAGGAACTAT TATCTTTGCC CATTGTCAAA AAGAACGTGA CTCAAACAAA ACAGTGCCAC
AACAATTGCA G G ATT ATT AT TCCCTACACG ATTTGTGCAC GGAAGACTAT ACGCAACCAG 
TGGATTGGTA CTGACATTCA GGTAAGATAA TCTAAATATT CTCTATAACA TAATTGTAA7 102 C 

G TGTTTT ATG TTTATAGCTA CAAATGTTTT ATG CAAAATA CATTTTATGA GGTCGGATA2 10BC 
TTATTAAAAG CATTGTCTTA AGTACATTAA AAGGACATTG TATAACCGTG CTACTTACAG 
CATGGCC7TT TTAAGACAAA CACTGTGGAT TTTATGGACA TTTACCATGG TTATTGGCCA 
GGACAATGAA AAGTGT7CCC AAAAAACCTT AATTGGATAT AGACTTAAAA TGTCTCGTGA 
CGGTGACATT GCAGTTGGAG AAACAGTGGA ATTACGTTGT AGATCTGGAT ACACTACTTA 
TGCCCGCAAT ATAACAG CAA CATGTTTACA AGGTGGGACG TGGTCTGAAC CAACGGCAAC 
ATGTAACAAA AAGTCGTGTC CAAACCCAGG TGAAATACAA AATGGAAAGG TTATATTTCA 
TGGTGGACAA GATGC 7T7AA AATATGGGGC AAACATTTCA TATGTTTGTA ATGAAGGATA 
TTTTTTGGTT GG7CGAGAA7 ACGTGCGATA TTGTATGATT GGAG CATCTG G C C AAATGG Z 
G7GGTCA7GT TCTCTTGCTT TTTGTGAAAA AGAAAAGTGT CACAGACCGA AAA7CAAAAA 
TGGAGATTT7 AAGCZ7GA7A AAGA7TATTA TGAG7ATAAT GATGCAGTTC A7TTTGAA7G 
TAA7GAAGGA TATA7TC7AG TTGGACCACA 77CCATTGCA TGTGCAGTTA ATAACACGTG 174 0 

GACA7C7AAC ATG Z CAAC77 GTGAAG7CGC AG G C TGT AAA TTTCCATCGG TGAGTCA7GG 
77ATCCAA7C CAAGG77777 C7G77AC77A 7 AAA CAT AAG CAAAG7G77A C7777GCA7G 
CAATGATGGA 777G77C7CA GAGGATCCCC CACAATTACG TGTAACG77A CTGAATGGGA 
CCCACCAC77 CC7AAG7G73 7777GGAAGA 7A7AGATGA7 CCAAACAA77 CAAA7CC7GG 
ACG77TGCA7 CCAACACCCA ATGAAAAACC AAATGGTAAT G7C7TTCAAC GCTCAAACTA 2 04 0 

TACAGAACC7 CCAACAAAGC C7GAAGACAC C CAT A C AG CA GC7AC7TGTG ATACCAACTG 
TGAACAGCCA CCTAAAATCC 7GCCAACATC CGAAGGT7T7 AATGAGACTA CCACATC7AA 
TACAATTACA AAAGAA77AG AGGA7GAGAA AAC7A7A7CC CAGCCAAA7A CACA7A77AC 
ATCTGCCTTA ACA7CCATGA AAG CGAAAGG 7AAC777ACC AACAAGACCA ATAACTC7AC 228 0 

TGATCTACAT A7AGCGTCTA CACCCACTTC CC AAG ATG AT GCTACGCC7T CAA7ACC7AG 
TGTACAGACA CCCAATTA7A A7AC7AACGC ACCGACACG7 ACAC7AACG7 C7CTCCATA7 
7GAAGAAGGC CCA7CGAA77 CTACTAC7TC AGAAAAGGCC AC77CC7C7A CTC777CACA 24 6 0 

CAAC7CACAC AAAAATGACA CCGGAGG CA7 A7ACACAACA 77AAACAAAA CAACACAGTT 2 520 

GCCA7CCAC7 AATAAACCTA CAAACAGTCA AGCCAAGAGT TCCAC7AAGC CACGCGTTGA 2 58 0 

GACACACAA7 AAAACAACCA G7AA7CCTGC CA77TCTTTA ACAGATTC7G CAGATGTGCC 2 64 C 

TCAG ™ CCG cgagAACCAA CAG7CCCTCC CA7TTTCAGG CCACCGGCGT C7AAAAA7CG 
CTATC7GGAA AAG CAAC7AG T7A7TGGAC7 ACTAACCGC7 GTCGCCCTAA C3TGTGGACT 
GATTACC7TA TTTCACTATC TGTTCTTTCG 7TAGCCTAGA ACTTGC7CCA GTG77AGACA
GCAGCAGC37 GACACTACTA ATGTAACC7A AAAAATGTG3 ATG7G3TAT3 7A7TGTACTA
AAGATACCGA CCAA7ACAAG ACAACTAATA TTAACCATAG 7GTGCG7T73 7TT37A7AAA 
ATACGCG7GT GGGAAAGCGA CAGAAGGGGG CGGCGTTTCC A7A7GAGGCC AAGTGCATT3 
GGTA7TTTAG GGGCGGTGAC CACGCACTA7 AGTGCGCGGT G73GCAGAAA AT7GACA"3 
7ATATAAACA AGGAAAGGGG ACTGTGCGCG CTTAAGCGCC AAGC3AT7A7 A3ACACGGG7 
7TT7TGTTGT 3TTGGCCAAT CGTGTCTCCA TGGCGC7AAA GGGACCACAA ACCCTCGAGG 
AAAATATTGG G7CTGCGGCC CCCACTGGTC CCTGCGGGTA CGTCTATGCC TATCTGA—C 
ACAA3TTCCC CATAGGGGAA GC3TCCCTGC TGGGCAATGG C7A3CCGGAG G3AAAAG7AT 
TTTGACTACC T CT7T7G CAC GGGCTCACAG TGGAATCCGA TTTCCCCTTA AACGTAAAGG 
CGGTGCACAA GAAAATCGAT GCAACCACAG CTTCTGTGAA ATTAACTTCA 7ACCACAGGG 
AGGCCATCG7 37T7GATAA7 ACTCACTTAT TTCAGCCAAT CTTTCAAGGA AAGGGACTGG 
AAAAGTTATG 7CGAGAGAGC CGAGAGCTGT TTGGATTTTC AACGTTTGT7 GAGCAAGAAC 
ACAAAGGGA3 G3T3TGGAGC CCAGAGGCAT GCCCTCAGCT A3C3TGCG-G AA7GA3ATT7 
TTA7GGCGG7 CATAG7TACA GAGGGATTCA AGGAGAGACT G7ACGGCGGC AAA37GGTGC 
CCGTGCCC7C TCAGACAACG CCCGTACACA TTGGGGAACA CCAG3CG7TC AAGA7A"CT 
TGTATGACGA GGATCTGTTT GGTCCAAGTC GCGCCCAAGA ACTATGTAGG T7TTACAACC 
C C G AT AT C AG TAGA7ACC7A CATGACTCCA TA77CAG7GG AATAGGACAG GC737AAGGG 
TAAAGGACG7 TAGCACGGTC A7CCAAGCGT CAGAAAGGCA A7TTG7GCA7 GACCAA7AGA 

agata:caaa gctggtccaa gccaaggact tcccccagtg tgcttccagg ggaac-gaog 

GG7CTA=C=T AATGGTGATA GACAGTCTGG TG3C7GAAC7 TGGTATGAG7 7A7GGT37G7 
CCT _ T ._ T3A 5GGA= CCCAG GATAG CTGCG AGG77CTAAA 77A7GACACG 7GGC3CA7C7 
TTGAAAACT5 CGAGACGCCA . GATGCGCGCC T7CGTGCACT AGAAGT7TG3 3ACGCAGA3C 
AGGCC77GGA TA773GCGCC CAGC737T7G CG3C3AA37C 7373CT37AG CT3ACCA3AG 
TGGCAAAGC7 GCC7CAGAAG AATCAGAGAG G AG ACG C CAA 3A73TACAAC 7CAT7G7ACC 
TACAGCATG3 CC7GGGA7AC CTC7CAGAGG CAACAG7AAA GGAAAA7GGA G—CTGCCT 
T3AAGGGCGT GC3AGTGTCT GCACTGGATG GGTCATCTTA CACCC7C7AC- CACC.oo— 7 
ACGCG7CCTC T7TCTCCCCA CAT3TCCTGG CAAGGA7G7G 77AGTAT3T3 CAGT7C77GC 
CCCACCATAA AAACACCAAC AGTCAGTCA7 ACAA7G7GG7 G3AC7A3G73 GS=A=C3=GG 
CACC7AG7CA AA7GTGTGAC C7G73TCAGG GGCAATGTCC A3r7G7A733 AT3AA3A7G3 
TST __ TA -. G gatGAAGGAC AGG7TCCCAC C7GT7C7G7C AAACG77AAG AGAGA3CCA7 468 0 

A7GTGATCAC GGGCACAGCG GGAAC37ACA ATGACCTAGA GA77CTCGGA AA3TT733CA 4 74 0 

CCT7CAGG3A GAGAGAGGAG GAGGGGAA7C C7G7GGAA3A 73373CAAA3 7A7A=A7ATT 4 800 

GGCAA37ATG CCAGAATATA ACCGAGAAGC 7AG3G7CCA7 G33CATCT3G GAGGGC3333 4 86 0 

( r~ T ^~r r-r.-'-i s- ^AAAGTGTTC AA 3 3 33 AT AG 4S2C 

ATGCCCTAAG AACCCTCATT G:Go^CAT.C C-.Aow..^.
ACAGCAC33T AGAGGCAGAG CTCCTAAAGT TTATTAACTG CATGATCAAA AACAATTAGA
ACTTCAGAGA GAACATCAAA TCCGTCCATC ACATCCTTCA GTTTGCATGC AACGTATACT 
GGCAGGCGCC GTGCCCGGTT TTTCTGACCC TTTACTACAA GTCACTGCTG ACGGTCATAC 
AG G AC AT AT G TCTGACGTCA TGTATGATGT ACGAG CAGG A CAACCCGGCC GTGGGAATTG 
TACCATCCGA GTGGCTTAAA ATGCACTTTC AGACAATGTG GACCAACTTC AAGGGTGCCT 
GCTTCGACAA AGGAGCAATC ACGGGCGGGG AACTAAAAAT AGTCCACCAG TCCATGTTCT 
GTGACCTCTT TGACACCGAC GCTGCCATAG GAGGGATGTT TGCACCCGCT CGGATGCAGG 
TCAGGATAGC C AG AG C AATG CTCATGGTTC CAAAAACCAT AAAAATAAAA AACAGGATCA 
TCTTTTCCAA CTCCACCGGA GCAGAGTCGA T C CAGG CAGG TTTTATGAAG CCGGCCAGCC 
AAAGGGATTC ATACATCGTC GGAGGACCCT ACATGAAATT CCTAAACGCC CTGCACAAAA 
CACTTTTTCC TTCCACAAAA ACTTCTGC CC TGTACTTGTG GCATAAGATT GGCCAGACCA 
CAAAAAATGC CATACTACCA GGTGTCTCGG GGGAACACCT AACGG AG TT A TGTAATTATG 
TAAAGGCAAG TAGCCAGGCT TTCGAAGAGA TAAATGTTTT GGACCTTGTG CCAGACACCC 
TGACATCATA TGCGAAAATA AAACTAAACA GTTCCATTCT CCGGGCTTGC GGACAGACAC 
AGTTTTATG Z AACTACTCTC TCTTGCCTTT CGCCAGTGAC TCAGCTGGTT CCGGCCGAGG 
AGTACCCCCA CGTACTGGGG CCAGTGGGGT TGTCATCTCC AGATGAATAC AGGGCAAAAG 
TCGCCGGCAG GTCTGTAACC ATTGTACAGT CAACACTGAA GCAAGCTGTT TCCACCAACG 5 94 0 

GACGACTCCG G C CT AT C ATT ACCGTGCCAC TGGTGGTCAA CAAATATACA GGGAGCAACG 5 000 

GGAACACAAA CGTCTTTCAC TGTGCAAACC TGGGATACTT CTCGGGGAGA GGGGTGGACA 
GAAATCTCAG GCCAGAAAGC GTCCCCTTTA AAAAGAATAA TGTCAGCTCT ATGCTAAGAA 
AACGCCACGT GATTATGACC CCCCTGGTAG ACAG3CTGGT AAAGAGAATA GTTGGCATCA 
ACT CTGGG G A ATTCGAGGCA GAAGCGGTTA AGAGAAGTGT GCAGAATGTC CTGGAAGACA 
GAGATAACCC AAACCT3CCG AAGACAGTTG TATTAGAGTT GGTTAAGCCA CCTCGGTGGA 
GCTCCTGTGC AAGTCTCACA GAGGAGGACG TGATTTACTA CCTGGGC 

TTGGGGACGA GGTCCTGTCA TTACTGAGCA CAGTGGGC CA GGCGGGGGTG CCATGGACGC- 
CCGAGGGTGT GGCCTCGGTC AT C CAGG AC A TAATAGATGA TTGCGAGTTA CAGTTTGTGG 
GCCCAGAAGA GCCTTGCCTT ATCCAAGGAC AGTCGGTAGT GGAGGAGCTT TTTCCGTCCC 
CGGGCGTCCC AAGCCTGACA GTGGGTAAAA AACGAAAAAT CGCATCCCTG CTCTCTGACC 
TGGATTTGTA GTTGTGTACC CGTAACGATG GCAAAGGAAC TGGCGGCGGT CTATGCCGAT 
GTGTCAGCCC TAGC CATGGA CCTCTGTCTT CTTAGTTACG CAGACCCGGC AACACTGGAC 
ACTAAAAGTC TGGCCCTCAC TACAGGGAAG TTTCAGAGCC TTCACGGCAC ACTACTCCCC 
CTCCTCAGAC GACAAAACGC ACACGAATGC TCAGGTCTGT CACTAGAATT GGAGCACTTT 
TGGAAAACGT GGCT3ATGCT CTGGCCACGT TGGGAGTGTG C ACT AG "AG A AAACTGTCTT 
CAGAAGAG C A TTTTTCCCTC CTGCATTTGG ACACAACATG CAACAAGCAA CCGGAGCGTT
AGGT~TAAT7 TTTACGGAAA TTGGGCCTTG GAGT7AAAGC TG T C ACT AAT AAACGACGTT
GAAATTTTCT TTAAACGTCT TAG7AGCGTT TTTTATTGTA TAGGATCGGG CAGTGCTCTG 
GAGGG^TAG GGGAGGTATT GCGTTTCGTT GGGAAGCTGA GGGGTATCTC ACCCGTACC7 
GGGCCGGACC TATATGTCTC AAATCTGCCC TG C CTAGAAT GCCTTCAGGA AGTGTGTCTG 
ACTCCCAACC AGGGCACCAG TCTGCAGGCC ATGCTCCCAG ACACGGCCTG CAGTCACATA 
TGTACCCCCG CATGCGGTGA GCCTGTCCSG GGCCTCTTTG AGAACGAGCT AAAACAGCC 
GGGCTTCAAA CCCCTGAGTC CATACCTACT ACCCCCTGTC AGTCCCGGGT AAGG CAAGAT 
GATGAAATCA GACAGAGC7C TCTAATGGCG GTAGGAGATC ACCACATTT7 CGGAGAGGTG 
ACCAGATCTG TCCTG3AAAT CTCAAACCTG ATCTATTGGA GCTCTGGCCA CTCGGATGCC 
AC CTGCGACG GAGACAGAGA CTGCTCTCAC CTGGCCTCGC TGTTTACTCA CGAGGCTGAC 
ATG C AT AAAA GGCGCGTCGA CCTGGCCGGA TGCTTGGGCG AACGCGGCAC GCCCAAACAC 
_ TTTTGACT G CTTTCGCCC AGACTCCCTA GAAACCCTTT TCTGTGGTGG TCTTTTTAGC 
TCCGTGGAG3 ACACCATAGA AAGTCTCCAA AAGGACTGCT CTTCTGCCTT CT AC C AAC AG 
GTAAACTACA CTACTGCACT GCAAAAACAG AACGAGTTTT ACGTCCGACT CAGCAAACTG 
CTGGCAGCTG GTCAGCTAAA TTTGGGCAAA TGTTCCACTG AAAGTTGCCA ATCCGAGGCC 
CGTAGGCAGC TGGTAGGTGG GGGCAAACCA GAGGAAGTGC TGAGGGATGC AAAACACCGG 
CAAGAACTAT ACC77CAGAA AGTGGCACGC GACGGTTTTA AAAAACTCTC TGATTGTATA 
AGACACCAGG GCCACATCC7 GTCTCAGACC CTGGGTCTAA GACTGTGGGG GTCTGTCATC 
TACAACGAGG CATC7GCCCT ACAAAACCAC TTTTTACACA GAGCACAGTT CATATCCCTC 
CCCTGGCAG3 ACCTGACGGT CGACTGTCCA ACGCGGTTTG AAAATTCTAA A7ATATCAAA 
AATTCTCT GT ACTGC2AGCG TCTGGGGCGG GAACACGTAG AGATCCTGAC ACTGGAG-.C 8220 
TACAAACTTA TCACGGGCCC GCTGTCAAAG CGACATACTT TATTTCCCAG TCCTCCAAA7 82B0 
GTGACGCTG3 CTCAGTGCTT CGAGGCTGCG GGCACGCTTC C C CAT C AAAA GATGAT3GTA 8 34 0 

TCAGAGATGA TCTGGCCCAG CAT AG AG C CG AAGGACTGGA TAG AG C C C AA CTTCAACCAG 84 00 

TTCTATAGCT TTGAG AAT C A AGACATAAAC CATCTGCAAA AG AGAG CTTG GGAATATATC 8460 
AGAGAGCTGG TATTATCGGT TTCTCTGTAC AACAGAACTT GGGAGAGGGA G CT AAAAATA 8520 
^TTCTCACCC CTCAGGGCTC ACCGGGGTTT GAGGAACCGA AACCCGCAGG ACTCACAACG E SEC 

GGGCTGTACC TAACATTTGA GACATCTGCG CCCTTGGTGT TGGTGGATAA AAAATATGGC 864C 
TGGATATTTA AAGACCTGTA CGCCCTTCTG TACCACCACC TGCAACTGAG CAACCACAAT 
GACTCCCAGG TCTAGATTGG CCACCCTGGG GACTGTCATC CTGTTGGTCT GCTTTTGCGC 
AGGCGCGGCG CACTCGAGGG GTGACACCTT TCA3ACGTCC AGTTCCCCCA CACCCCCAGG 
ATCTTCCTCT AAGGCCCCCA CCAAACCTGG TGAGGAAGCA 7CTGGTCC7A AG;
CAAAAAAAAC ATAGTGCCTC AT AT CTTTAA GGTGCGGCGC TATAGGAAAA TTGCCACCTC
TGTCACGGTC TACAGGGGCT TGACAGAGTC CGCCATCACC AACAAGTATG AACTCCCGAG 
ACCCGTGCCA CTCTATGAGA TAAGCCACAT GGACAGCACC TATCAG7GCT 7TAGTTCCA7 
GAAGGTAAAT GTCAACGGGG TAGAAAACAC ATTTACTGAC AGAGACGATG TTAACACCAC 
AGTATTCCTC CAACCAGTAG AGGGG CTTAC GGATAACATT CAAAGGTACT TTAGCCAGCC 
GGTCATCTAC GCGGAACCCG GCTGGTTTCC CGGCATATAC AGAGTTAGGA CCACTGTCAA 
TTGCGAGATA GTGGACATGA TAGCCAGGTC TGCTGAACCA TACAATTACT 7TGTCACGTC 
ACTGGGTGAC ACGGTGGAAG TCTCCCCTTT TTGCTATAAC GAATCCTCAT GCAGCACAAC 
CCCCAGCAAC AAAAATG G C C TTAGCGTCCA AGTAGTTCTC AACCACACTG 7GGTCACGTA 
CTCTGACAGA GGAACCAGTC CCACTCCCCA AAACAGGATC TTTG TGG AAA CGGGAGCGTA 
CACGC7TTCG TGGGCCTCCG AGAGCAAGAC CACGGCCGTG TGTCCGCTGG CACTGTGGAA 
AACC77CCCG CGCTCCATCC AG A CT AC C C A CGAGGACAGC TTCGACTTTG TGGCCAACGA 
GATCACGGCC ACC7TCACGG CTCCTCTAAC GCCAGTGGCC AACTTTACCG ACACGTACTC 
TTGTCTGACC TCGGATATCA ACACCACGCT AAACG C C AG C AAGGCCAAAC 7GGCGAGCAC 
TCACGTCCC7 AACGGGACGG TCCAGTACTT CCACACAACA GGCGGACTCT ATTTGGTCTG 
GCAGCCCATG TCCGCGATTA ACCTGACTCA CGCTCAGGGC GACAGCGGGA ACCCCACGTC 
ATCGCCGCCC CCC7CCGCA7 CCCCCATGAC CACCTCTGCC AGCCGCAGAA AGAGACGGTC 
AGCCAGTACC GCTGCTGCCG GCGGCGGGGG GTCCACGGAC AACC7GTCTT ACACGCAGCT 
GCAGTTTGCZ TACGACAAAC TGC-3GGATGG CATTAATCAG GTGTTAGAAG AACTCTCCAG 1014C 
GGCATGGTGT CGCGAGCAGG TCAGGGACAA CCTAATGTGG TACGAGCTC^. GTAAAATCAA 1020C 
CCCC^CCAGC GTTATGACAG CCATCTACGG TCGACCTGTA TCCGCCAAGT T33TAGGAGA 
CGCCATTTCC GTGACCGAGT GCATTAACGT GGACCAGAGC TCCGTAAACA TCCACAAGAG 
CCTCAGAACC AATAGTAAGG ACGTGTGTTA CGCGCGCCCC CTGGTGACGT TTAAGTTTTT 
GAACA3TTCC AACCTATTCA CCGGCCAGCT GGGCGCGCGC AATGAGATAA TAC7GACCAA 
CAACCAGGTG GAAACCTGCA AAGACACCTG CGAACACTAC TTCATCACCC GCAACGAGAC 10500 
TCTGGTGTAT AAGGACTACG CGTACCTGCG CACTATAAAC ACCACTGACA TATCCACCCT 10 56C 
GAACACTTTT ATCGCCCTGA AT CT AT C CTT TATTCAAAAC ATAGACTTCA AGGCCATCGA 1062C 
GCTGTACAGC AGTGCAGAGA AACGACTCGC GAGTAGCGTG TTTGACCTGG AGACGATGTT 10680 
CAGGGAGTAC AACTACTACA CACATCGTCT C3CGGGTTTG CGCGAGGATC TGGACAACAC 
CATAGATATG AACAAGG AG C GCTTCGTAAG GGACTTGTCG GAGATAGTG3 CGGACCTGGG 
TGGCATCGGA AAAACGGTGG TGAACGTGGC CAGCAGCGTG GTCACTCTAT GT3GCT3ATT 
GGTTACCGGA TTCATAAATT TTATTAAACA CCCCCTAGGT GGCATGCTGA T G AT C ATT AT 
C GTT AT AG C A ATCATCCTGA TCATTTTTAT GCTCAGTCGC CGCACCAATA CCATAGCCCA 
GGCGCCGGTG AAGATGATCT ACCCCGACGT AGATCGCAGG GCACCTCCTA 3CG3CGGAGC
CCCAACACGG GAGGAAATCA AAAACATGCT GC7GGGAATG CACCAGC7AC AACAAGAGGA
GAGGCAGAAG GGGGATGATC TGAAAAAAAG TACACCCTCG GTGTTTCA3C GTACCGCAAA 
CGGC7TTCG7 CAGCGTGTGA GAG G AT AT AA ACCT7TGAC7 CAATCGGTAG ACA7CAG7" 112- 

ggaaacgggg gagtgacagt ggattcgagg ttattgtttg atgtaaattt aggaaacacg 112 so 

GCCCGCCTCT GAAGCACCAC ATACAGACTG CAGTTATCAA CCCTAC7CG7 TGCACACAGA 1134C 

cIcaaattac CG7CCGCAGA TCATGGATTT TTTCAATCCA TTTATCGACC CAACTCGCGG ZliOC 

AGGCCGGAGA A^CACTGTGA GGCAACCCAC GCCGTCACAG TCGCCAACTC- TCCCCTCGGA 11460 

GAGAAGAGTA TGCAGGCT7A TAGCGGCCTG TTTCCAAACC CCGGGGCGAC CCGGCGTGGT 11520 

TGCCGTGGAC AC CACATTT Z CACCCACCTA CTTCCAGGGC CCCAAGCGGG GAGAAGTATT 115e = 

CGCGGGAGAG ACTGGGTCTA TG7GGAAAAC AAGGCGCGGA CAGGCACGCA ATGGTCC7AT 11640 

G-CGCACCTC ATAT722ACG TATACGACAT CGTGGAGACC ACCTACACGG CCGACCGCTG H700 

CGAGGACGT3 CCATT7AGCT TCCAGACTOA TATCATTCCC AGCGGCACCG 7CCTCAAGCT 11760 

GC7CGGCAGA ACAC7AGATG GCGCCAGTGT CTGCGTGAAC GTTTTCAGGC AGGGCTGCTA 11820 

CTTCTACACA CTAGCACC^C AGGGGGTAAA CGTGACCCAC GTCCTCCAGC AGGCGCTCCA 11BBC 

GGCTGGCTTC G3TC3C3CAT CCTGCGGCTT CTGCACCGAG CCGGTCAGAA AAAAAATCT7 11940 

GCGCGCGTAC GACA2ACAAC AATATGCTG7 GCAAAAAATA ACCCTGTCAT CCAGTCCGAT 12 000 

GATGCGAACG CTTAG2GACC G Z CTAACAAC CTGTGGGTGC GAGGTGTTTG AGTCCAA7GT 1206C 

GGA2GCGA7T AGGC37T7CG T3G7GGACCA CGGGT7CTCG ACA7TCGGGT GG7ACGAGTG 12120 

CAGCAA7C" G2"2C=3=A CGGAGGCCAG AGA2TCTTGG ACGGAAGTGG A37TT3ACTC- 12190 

CAGCTGGGAG GACC7AAAGT T7ATCCCGGA GAGGACGGAG TGGCCCCCAT AC7CAA7C77 12240 

iTCC _ TTGAT ATA3 ^TGTA TGGGCGAGAA GGGTTTTCCC AACGCGAGTC AAGACGAGGA 1230C 

CATGATTATA CAAA72723T GT3TTT7A2A CACAGTCGGC AACGATAAAC CGTACACGCG 12360 

CATGCTAC7G GGC2TGGG3A CATGCGACCC CC7T2CTGG3 GT33AG3TCT 7TGAG7TTCC 

TTCGGAG7AC GACATGC7GG C2GCCT7CC7 CAGCATGCTC CGCGATTACA ATGTG3A3T7 

TATAACGGGG TACAACATAG CAAAC7TTGA CCTTCGATAC ATCATAGC" GGGCAAC7CA 

, __, G - TGC A3GA-TTCAC CAAAATAAAA AC7GGG7CCG 7GT7T3AGG7 

GGTGTAC^rtC . .^nnb^lu^ HOJft 

CCACCAACCC AGAGGCGG77 CCCA7GGGGG CAAC77CA7G AGG7CCCAG7 CAAAGG7GAA 
AA7A7CGGGG A7CG7CCCCA TAGACATGTA CCAGG777GC AGGGAAAAGC 7GAG727G7C 
A3ACTA CAAG CT33 AGAGAG TGGGTAAGCA A7GCC7CGG7 CGACAAAAAG A7GACA7C72 
A7ACAAGGAC A7A2GGGCGC 77777 AAA7C 7GGGCCTGA7 GG7CGCGCAA AGG7GGGAAA 
C7AC7G7G77 A77GAC7CGG 7GC7GG77A7 GGA7C77C7G C7ACGG7772 AGACCCA7G7 
TGAGA7C7CG G AAA7AG C C A AGC7GGCCAA GATCGGCACC CG7AGGG7AC 7GAGGGACGG 
CCAACAGA7C AGGG7A777T CG7GCC7C77 GGAGGC7GC7 GC2ACGGAAG G77ACA77C 
CGCGG7GCCA AAAGGAGACG CGG77AGCGG G7A7CAGGG3 GGGAG
TCCGGGA7TC TATGACGACC CCGTACTCGT GGTGGATTTT GCCAGCTTG7 ACCCCAG7A7
CATCCAAGCG CACAACTTGT GCTACTCCAC ACTGATACCC GGCG ATT CG C TCCACCTGCA 
CCCACACCTC TCCCCGGACG ACTACGAAAC CTTTGTCCTC AGCGGAGGTC CGGTCCACT7 
TGTAAAAAAA CACAAAAGGG AGTCCCTTCT TGCCAAGCTT CTGACGGTAT GGCTCGCGAA 
GAGAAAAGAA ATAAGAAAGA CCCTGGCATC ATGCACGGAC CCCGCACTGA AAACTATT C7 
AGACAAACAA CAACTGGCCA TCAAGGTTAC CTGCAACGCC GTTTACGGCT TCACGGGCG7 
TGCCTCTGGC ATACTGCCTT G C CT AAA CAT AGCGGAGACC GTGACACTAC AAGGGCGAAA 
GATGCTGGAG AGATCTCAGG C CTTTGT AG A GGCCATCTCG CCGGAACGCC TAG CGGGTCT 
CCTGCGGAGG CCAATAGACG TCTCACCCGA CGCCCGATTC AAG G T CAT AT ACGGCGACAC 
TGACTCTCTT TTCATATGCT GCATGGGTTT CAACATGGAC AGCGTGTCAG ACTTCGCGGA 
GGAGCTAGCG T C AAT C AC C A CCAACACGCT GTTTCGTAGC CCCATCAAGC TGGAGGCTGA 
AAAG AT CTT C AAGTGCCTTC TGCTCCTGAC TAAAAAGAGA TACGTGGGGG TACTCAGTGA 
CGACAAGGTT CTGATGAAGG GCGTAGACCT C ATT AG G AAA ACAGCCTGTC GTTTTGTCCA 
GG AAAAG AG C AGTCAGGTCC TGGACCTCAT ACTGCGGGAG CCGAGCGTCA AGGCCGCGGC 
C AAG CTT ATT TCGGGGCAGG CGACAGACTG GGTGTACAGG GAAGGGCTCC CAGAGGGGTT 
CG T C AAG AT A ATTCAAGTGC TCAACGCGAG CCACCGGGAA CTGTGCGAAC GCAGCGTACC 
AGT AG A C AAA CTGACGTTTA CCACCGAGCT AAGCCGCCCG CTGGCGGACT ACAAGACGCA 
AAACCTCCC3 CACCTGACCG TGTACCAAAA GCTACAAGCT AGACAGGAGG AGCTTCCACA 
GATACACGAC AGAATCCCCT ACGTGTTCGT CGACGCCCCA GGTAGCCTGC GCTCCGAGCT 1422 C 
GGCAGAGCAC CCCGAGTACG TTAAGCAGCA CGGACTGCGC G7GGCG3TGG ACCTGTACTT 14280 
CGACAAG CTG GTACACGCGG TAGCCAACAT CAT C C AATG C CTCTTCCAGA ACAACACGTC 1434 C 
GGCAACCGTA GCTATGTTGT- ATAAC7TTTT AGACATTCCC GTGACTTTTC CCACGCCCTA 14400 
GTGACTCAGA CGCGGAAACA GCGCCTAGAA AGTTTCCTCT TGCGCTATGT GGGACAACTA 14460 
GAGTCCAACC TGGCAAGCAG TGGAGCAAGA CGCCAGACAG CCGATC7CGA AAAAAATAAT 
GCAGACAGAG GCAACGTTCA TCCTAGGTGA CTGGGAGATA ACGGTGTC7A ACTGCCGGTT 
TACTTGCAGC AGCCTAACAT GTGGCCCCCT TTACAGATCT AGCGGCGACT ACACGCGGCT 
AAGAATCCCC TTCTCTCTGG AT CG ACT AAT ACGTGACCAT GCCATCTTTG GGCTAGTGCC 
AAATATTGAG GATCTGTTAA CCCATGGGTC ATGCGTCGCC GTAGTG3CCG ACGCAAACGC 
CACAGGCGGC AACGCGCGAC GCATCGTCGC GCCTGGCGTG ATAAACAATT TTTCAGAACC 
CATCGG ~ ATT TGGGTAC GCG GCCCTCCGCC GCAAACGCGC AAGGAAGCTA T7AAGTTCTG 14880 
CATATTTTTT GTCAGTCCCC TGCCCCCGCG GGAGATGACC ACATATGTGT TCAAGGGCGG 14 940 
CGATTTGCCT CCCGGAGCAG AGGAACCCGA AACACTACAC TCCGCCGAGG CACCCCTACC 
GTCGCGCGA3 ACGCTGGTAA CTGGACAGCT GCGATCCACC TCGCCGCGAA CGTATACGGG
GTGTGACAAC GTGGAAGGTG ACCCCGAGCA ATTGACACCC AAGTACTTGA CGTTCACGCA
GACGGGAGAA AG ACTTTG CA AAGTAACCGT TTACAACACC CATTCGACAG CATG CAAGAA 
GGCCCGTGTT CGTT7CGTCT ACAGACCGAC GCCGTCCGCC CGTCAGCTTG 7CATGGGTCA 
GGCTTCACCC CTCATAACAA CCCCTCTGGG AGCCAGGGTA TTCGCAGTCT ATT TAG ACT 3 
TGAGAAAACT ATCCCACCTC AGGAAACCAC CACCCTGAGG ATTC AATTG C TGTTCGAGCA 
GCATGGTGCC AACGCCGGAG ACTGCGCCTT TGTCATCATG GGGCTCGCCC GTGAAACAAA 
GTTTGTCTCA TTTCCCGCAG TACTTCTTCC GGGCAAGCAC GAACACCTTA TTGTATTCAA 
CCCACAGACA CATCCTCTGA CCATTCAACG GGACACAATA GTGGGCGTGG CAATGG CTTG 
CTATATCCAC CCCGGTAAGG CAGCCAGCCA GGCACCATAC AGCTTCTACG ACTGCAAGGA 
AGAGAGCTGG CACGTGGGGC TCTTCCAGAT CAAACGCGGA CCGGGAGGGG TCTGTACACC 
ACCTTGCCAC GTAGCGATTA GGGCCGACCG CCACGAGGAA CCCATGCAAT CGTGACTGTC 
CGAGCACATA TGGCGCAGGA G 7 C AG AG GAG TGGTCCCGTG CGTTTGCAGT GT G C AG TAG T 
AAACGACAGC TCGGGCGCGG CGAGCCCGTG TGGGATTCCG TCATTCACCC GAGCCACATC "5900 
GTCATCTC7A ATCGAGTACC CCTCTTACTA AGAGAACAGC ACATATGTCT CC CTTCGTGC 
~- CAG ~ GT ,- 3 gcCAGATCCT CCACAGAGCC TACCCCAACT TTACATTTGA CAACACGCA7 
CGCAAGCAGC AAACGGAGAC CTACACTGCA TTCTACGCTT TTGGGGACCA AAAT AA CAAG 
GTTAGGA7C7 TGCCCACTGT TGTGGAAAGC TCCTCGAGCG TGCTGATTTT TAGACTGCGT 
GCATCGGTCT CTGCGAACAT CGCCGTGGGA GGG CTCAAAA TAATAATACT TGCTCTCACT 
crGGT3c;:iT 3 CCCAAGGAGT GTACCTGCGT TGCGGTAAGG ACCTTTCTAC ACCACAC7G2 
GCACCGGCTA T7GTTCAGCG TGAGGTGCTG AGCAGCGGGT TTGAGCCGCA GTTTACCG7A 
AC7GGCAT7C CAGTGACATC CTCGAACTTA AACCAATGCT ACTTTCTGGT AAGAAAGCCA 
AAAAGCCGGC TGGCAAAGCC .GTTTGCACGC CTGTCCGCGG AG AC G ATT 3 A GGA3TGTCG2 
GTCAGGTGTA TTTGTTTTGG GAAGACACAC CTGCGGATAT CGGTGACTGC GCCTGCGCAG 
GAAACGCCCG TCTGGGGGCT CGTGACCACG AGCTTCAGCC TTACCCCCAC CGTATTGTT3 
GCCTTTGATC GTAACCCGTA CAATCACGAG ACATTTGCCT GTAATGCCAA GT ACTA CATC 
CCAGT2ATCT ACAGCGGACC AAAAATTACG CTGGCCCCGC GCGGCCGCCA G3TAGTCTG3 
CACAACAACA GCTACACGTC CTCCCTGCCA TGTAAAGTCA CAGCCATCGT GTTAAACCAC 
TGCTGTAACT GTGACATATT TTTAGAGGAC TCGGAATGGC GCCCAAACAA GCCAGCACCZ 
CTGAAACTG3 TGAACACGAG TGATCATCCC GTCATATTGG AGCCGGACAC ACACATT3GA 
AACGCCCTCT TCATCATCGC ACCCAAGGCC CGAGGTTTAC GCAGACTGA2 TT3TTTAAT3 
ACAAAAACCA TTGAACTTCC TGGCGGGGTA AAGATAGACA G CAGG AAATT ATAAATATTC 
AGAAAAATGT ATGTTGCCAC CGGACGCAG7 TAGGTGTCCG GTTCCCACCC ACACATTTGT 
7TGCT TTCAAATAAA ACGGTGTTCT gtcaacctcc tccgggctca ctagtattgt
TAACACACAC AAAGACATAG TGACTGTAGA CAGTTAATCT TTATTGTCTA GACACGCAAA Z^ZZ
GTATATTAGT GTTATAAGAA ATTTTATGTC ACGTCGCTCT TTACTTAT CG TGGACGTCAG 
GAGTCACGTC TGGGATAGAG TCCAAAACAC GCACCGCTTG ACCTGCAAAC TTTTCCATTG 
CACTCAGAAC ATAAAACGAA GCAAAGTGTC TCACCCAATA CTTAAGTCCC TG AAG C CT C C 174 0C 
C T AAT AG A C C GCGGTCAAAT TTGGGTGGAC TGTAGTGCGT CTTAGTCAGC TTATTGAGCT 17460 
CTTCCTGTAT GTCCCATCCT AAGGTCTTCG TCAGAAGCTC CATGACGTCC ACGTTTATCA 
CTGATTTTCC AAACTCCGTC GTTAAAAACT TAAACAACAC CTCGAATTCA AAAAAGCCAT 
CGGCGAGCTT TTTAAGG CAG CTAGTCT CAT T AAATC CTAT TAACCCGCAG TGATCAGTAT 
CGTTGATGGC TGGTAGTTTC AGATGAAAAA TAG CAG CGGG CTCTAGAATA CCCTTGCAGA 
TGCCGGTACG GTAACAGAGG TCGCGGAAGC ATT C AT CG AT CACCCATAGC ATCCAATTGA 
GTCTCTGAAT GAGAAGATCC TTTTCAAACT CGGGGGCGTC CGGCAACTTG CCCCGCGTTC 
CAG AT AC CAG CAGTGAACCG ACCAGCAAGA GAG AC C AC AA CTTGAACCAG CACATGGCTG 
CTAACGCGGC ATACACTAGC CGGTGGTGCC CG AG CGGG AG TTACGAAGTC TCACTGAAGG 
GCGGGGTCGC GGGTCGGGGC CGCTCCAAAT CAGGCAACGC CGTATCCGAA CTCTGAGTCA 
CTTTTATGTA GGTCTCAAAC ATGTAAAAGA TACCACGTTC TTGAAAAACC CTCTCTTGCT 
CGCCAGGCTT GGGGTTCACG CGGGCATACG CAGCCAAGCT ATCATGCGAG AGAAACACGT 
CACACGCAAA GTCATGTAAA ACCCGGGTTA AAAATAGCCT AACTGGCCAG GGGCCAGTGA 
GCGCCTCCCG GTACAAGTCC CCACCCCCGA TGACCCAAAC CTTGTCAATT TGCTGTGCTA 
GCTCTGGGCT TCTCGCCAAC CCAAGCGCGG CATCGAGCGA ACTCG CC AAA AAGTGAGCAC 
CAGGGGGCGG GGTTTCTAAC GTGCGACTTA GAACCACATT GATTCTACCC GCCAATGGTC 
GACAGCCCGC GGGAATCGAA AGCCATGTGC GCCGCCCCAT AACAACGATC- TTTTGTTTTt 
CAGGGGCACA GTCGGTAGTC AG CTGTCGAA AACGCCTCAT GTCTCCCCGC AATGCAGGCC 
A-CGGGAGACA TCTGTTTTTT CCGATCCCGA GTTTGGTATC AACCGCAACT ACACAGTAAA 
G TG T AG GAT C CATGCCGCGA GGGTATAGGT AAACACCACC AACCACACAG TGTGCTCTTA 
TATACTTTTA AT G AAA CAT A AGGG C AG ACG AAACAGCCGA ACGTTTCCTA ATCACGCCCA 
TGGAACCATA GCCACCCCCA GGCAAACCCT GTGGAAGGAT AT CAACT AG A GAGGAGGGTC 
CAG C CTTATT ATGGCAGGAG A CA CTAT AAG CCCCATCGCC CGACTGGGCA CCAACATAAC 
CGCCACAGTA AGTGGCCCTA TACCGCTCAG CG CCC AAGTT GTTACAGTCA CACCCAACCG 
CGGTTGGCTC TACATTGTCA TCACGTCCAT CATTATGTGT TGGTTCTCCC GCTTCCTTGT 
ACCCTGCAGC TTCATCCACG GATTCTTCTG AGTCGCGATG CACAGGAGCG CCATCCGCGG 
GGCCATCTTG GTCGCCTGGA GCTGCCCCCG CGGGGCCATT TTGGTCGCCT GGAGCTGCCC 19020 
CCGCGGGCCC CTCCTCGTCC TGGTTATCCC CACGGGGAAG AATTTCCTGA AGCTCGATCT 19080 
CCTCTACTGC ACACTCTGGT GATGTCGGCC GAG GT CTAT A TGGAAACACT TCAACCCGCG 1914 0 
TGTTTACAGC AGCGTATGCC CGCCCCACGT GGCGCATCAT GTGGAAAAAC GCACCCAACC
CAAAAACGAC AAACAATTGG TAAAACACGA AAAAAACGTA GTACGCGGCT G
GATCTATCTC TGGGTCATGA CCGCCCACTA TAT AT AG CCA AACCCACGTC GCAGCGGCAA 
GGCCAGCGGC CCCCAATGTC ATAATGAAAA TAAAAACAAT CAGTTCCAGA CCCTCCTG3T 
AAGTCAGCCG AGGCAATAGC GTCATTTCGC GCAAGGGTCG CCAGACCACG CGCGTGTTGT 
ATACGACGCC ACATATCTGA CAGGCCGTG7 TT CT AG AG AT AG TG AG CC AG GTGCTTAAAC 
AA=TT=TAT3 gacGTTCTCG AGCTCTCCTG TGCATCCACA GGCTCTAAAT CTCTCATTTC 
CGAGCTCCTC GTTGCAAATC CAGCAGACAG GAACATCCTC ATCTTCCATA TCCTGAGAGA 
GAACCCACAA TAAAACATGG CATTAACCCC TGCAACAAGT GACCGTACCA GG3CACGCG7 
CCAGGCAACC GGGGTCCCCC TCGTTGGTCT ATACAATTCC ATGACTACCT ACTGGTAATG 
CTACAGCCAC TCACTGTACA AG C CG G TT AA CTGGGAGGCG ACGCTGGCGT GGTATCGGCC 
AACTGAAACA CACCACTCCA CTCCAAACAC TTATGTACTT TGTGGCTCGG CTTTATTGTA 
ACAGCCAAGA GGGGCGTTTG TGGCTCAGCT TTATTGTAAC AG CC AAGAGG GACGTATGTG 
GCTATCTCAC AAAAAGTCAC CGATTCATGT AGACAACCCG CTCCCACGAA TTCGGTTTT7 
AAAAAGCCC7 CACGTATACA GACGGGCCAC TAAATACGCA CATGAGCGGG CATCCT3TTT 
CCGCCTTGAC GCCCACCACT CTGACCGCAC GCTAAACATC GCCCTACCTG CTATACTGCC 2010C 
ATTTCCATAC GAATGGTAGG ATGCGGGCAG TAGTCCACCA GTCTAAAATC ATCAGGTGTA 2CISC 
AACTCTTCCA TGGAAGAAAC AGACCGGAGT ATCTCCAGGC GCGGAAAGGG ACGTGGAGTG 20220 
CGCGTCA3CT GCAGCCGTAG TGGCTCTATA TGCGTTTTGT AGATGTGG GC ATCTCCCAAC 2C280 
GTGTGAATAA ACTCCCCGGG TCTAAGACCA G7AACATGAG CAAGCATATA AGTTAAGAGG 20340 
GAATAG CTGG CAATGTTAAA AGGAA.CTC CC AAACCCATGT CTCCCGACC7 CTGATACAGC 2 04 0C 
TGACAGGAAA GCTCACCGTC AG CT AC AT AA AATTGACATA ACAAGTGACA GGGC3GAAGC 2 0460 
GCCATCAACG ACAAGTCCGC CGGGTTCCAC GCACACATAA TGATTCTTCT ATCGTGCGGA 20S2C 
T __ WTT -. TA TT AAATCCAC AATGTACGAC AATTGGTCAA ACCCCTGGCC TGTATAGTCA 
GCATCCGCGT CCACGTACGC CGCCCCAAAG TGCCTCCACT G3AAACCGCA 
AAATCCCCCT CCCTTCTGTG CGCCAGGCCG CGCCCGGCCA GGAACTCCCT GGAGCCATTT 
TTGT C C CAT A TCTTGACTCC TGTTCTTGAA AGCTCCCTGG AGTCAGTACT CCCCTTCAGA 
AACCAAAGCA GCTCTTGCAC TACGCCTCGC CAAAACACCC GCTTTGTGGT TAGTAAGGGA 2082 c 
AAGTGGTCCC GCAGACTATA CCTG3CCTGC ATG C CAAATA GAGAGAGGGT C-CCTAT3CCG 20390 
GTGCGGTCGA GTCGATCGCT GCCACGGCAC AAAATTTCCC TCAACTGCC7 GAGATACTGA 2094 0 
AGTTCCCCGT GGGGCGTCTC AGCCCCAGT7 AC CT C ATG CT GAATCGAACA AGGGTCAACC HOOD 
TCGGGGGCCA AAGCCAAGAC GCCAGGCTTT TGACAGAAGC GAAACCCCC7 GGCACGGAAT 
AACTTTTTGG CGACATACAA GCTTAAAGGT ACAAACGGAA ACATGATAGA TCCTGGAAGT 
TTGTGAAG CC CTGTGCCCGG AG AG AC AC C C CTCAACTCGC AGTGCTCGGA GACCTACATG 

, , „.. rT A - AAAACACCGT ACGTAATACA 212 4 0 

TAT ACT C AG G CTCTTCTATA AAC^.vCC- —
CATTACTCAC AGT7CCCACG GTGACGCCCA AACCCATGCA CACGGGCGTG AT CG AT A CCA
GAAAACATCA CAAGAACAAA AAGTGTGTGT CTGACATTCA CATTTATTTT TACAAGACA^ 
TTTTGTGCAG T AG AG TTGTG CCTTCCGACA CCCCGCGCCG TTCGCTGTTC TCCTGTAATT 
GGGAGATCCC ACTCCTTGGC AGGCACGTTT CACGAAACGC TCTTGTCTCG CTGGCCTTAG 
ACTTGTGGAC CCAACATGGG TATCGTTAGA GATCCGTCGC GTAAATGCGC AG CTG G C AAA 
GCATTCTTCA GCGAGCAGTG ACTGGTAATT G CTG CATC AG CTTCTTCACC CAGTCTTTCG 
ATTTGTCG3C ACACACCTGG CGACCACGCT TTGTCAAAAA TATCACACCC GG CTTGCTG C 
ACAGTTGGGA GGTGGGGTAC C AG CTG G AC A GAAGCACCTG TGGTAATGGT CTTTTCTGGT 
AACCGAGACA GCACTTGTCC GGTCTATGCC AGGACGCTCC CAGCGTGTCC CCAGATTGCA 
AACAAAGCAA GGCAGTCAGC ACAGCGACGA GCAGGATGCC CTTGGTGTCC ATAACTCCCC 21840 
TCGTGTGTCC TCGTGTAAAT GCGAAACGGC GATGTTAGGT CAGGCGCGGT AAACAGCTCA 21900 
ACTCGGTTCA AAACACGTAC GTGATGTAGT GCTGGTTCTA CGACGCCTAC CTGTAAACTC 
CAGGATCCTG G G CTTTT ATT ACGAAGGCCA ACACCCCAAA AAATCCACGC CCCCGTGACC 
GCAGGGGCGG TTACTAACGA CGGTTACAGC- TCCCTCCCGA GCCACGCACC TGCCATGTAA 
CCTGCAAGGT AACCAGACAA ACATCTAGGA AG C GTAAAT A TCCCCAGGTA GGAGAAGTAT 
TGCATATGTC ACAGACTCAA CACACACGGG CCGTTACGCA ACGGCTAGGG GCATAACCCT 
TTACCGGCGC GAAGCGCTAC GCGCTTCGCG AGAGGTATCT CCGTGTGCTT CTCCATCAGA 
AGACGCGTGC GCCGCTTCGC AGGCGACCCG CATACTTTCC GCCCCGAGTG CGTTACAAAA 
ATGACTGCCT TCTGGCGACA ATACACGGTG GACGTCCAGT ACCACCCGCA TATCAGCT7A 
TCCGGTGGCA ATCTGGCACT GGACAGGGAA 7TCTCGCAAC AATCCGAGGC CATGATGGTG 
GCAGGACCGC TGGCCGCACA TAGCTCAATC ACGGCCACCC AGAAGAGCAG CCCCAAATGT 
GCGCGCAACA CCCAGCACAT GCTCCACATA CAG7TCTGGC GCCACAACGA TGATGCGCAA 
AGGGGTG2AT TACCCTAAAT CCCAGCCTAG TTATAAATTA TTGAAGCCCA GGCGACCAGG 
GGTCGCCGCG CTTTTCCTCC CCAAACGCGA CGATAAAGAC CAGCGTTGCC AAATGTAAC7 
TATGTATAAC CCAAAATATT GCGCATCGAT AAGGTTTGCC AAAACACCCG AAAGTACACA 
CACAAAAAAA CAG CAACAAG ACGCTCACTA GACATTCACC CCTTCCCCCA CCCCCGAAAA 
CAAAACAACT TGACACAGGG G AAA C AC CAG GGGCGGCGGA GGTTGTCAAT AGTGTCCAGT 
ATTTCGTTAG ACGCGGGTTC TTGGACCCGA TGTCCCAGGT CATTAAAG7C TCAAATGGGA 
TTAAAGGATC ATAGTTCCCA GGTTTAATAC TCCAAGCTAT CCCAGAACAG GACCCCGGCA 
GAACCCC3CT TAACAGCACC AAA7CCAC7T GCGGTCCCAG AAAAGGTCGC CGAGGTGGCA 2 3 040 
AGGTGACTGA AAAGGTCATA GAGAGGACAC CGG7CCCATT TCCCACGG7C CAAAAATCCA 
GCGCGCCCCA CCGGC7TTCC GAGAACTTCG GCAAAGCTAA TTTGCATGCG C7AATCC7T7 
TATGTGCATA AATTATGTAG ATGAGGAGTC GCGCATGCGC AGAAAAATTC AGAGCGCCCG 
GGTGCACGGG G7CACC7CCA GGTCACGCCG CTAGGTGGGA CCGTGAGCGA CTCGAAAAAT
TATAATTTTT GGCCA7TTCA TGGGCGCCGC CATCTTGAAT 7TGCTAATCC CCCATAATC7 
TCTGCCCCGC TCCCATTGGT CCGCCGGCCC GTCAATCAAA GTTTTCCGAG CCGCCATTGG 
CCCATCCGGC CGACCAATCC CGTTCGAGCT AGGCGACCGC G C C ATT C 3 AT TGGACGCCCC 2346C 
AGCCGTCAAT CAAATTCGGA GGCCTCCCAT TGGCCCCTAT CCCTAGAACT CCCAAGCTGA 23 520 
TTGGCCCAGA GCGGGAACCA ATCAGCGATT AGAGTTTTGT TTTGATTTTT C CT AT AT AT A 2 3 38 0 
TATATATAAT CCTTTAATCC TAGCGCAGCT GAGTCATCGC AG C C C CT ATT CCAGTAGGTA 2 3 640 
TACCCAGCTG GGTAATCCAG TAGGTATACC CAGGTGGGTG AACCCAGCTG GGTATACCCA 
GCTGCAATTC TATAATTAAA CAAGGTAGAA ACCAACGGGG TCCTCAGGTG GTATTTCCGG 
AAGCATTACC AAATAAGGCA ACCTCAGCTG GGAATACCAG CGGACTACCC CCAACTGTAT 
TCAACCCTCC TTTGTTTTCC GGAAGTATAT CCATTTATGG AAATCAGCTG GGTCACTCTA 
CTGGGTTATT CTTTATAATA GGGCCCGATG AGTCATGGGG TTGGGATTTT TCTACTAGGT 
CGTTTCGGTG GATGGGTGCC AGGATTATAG GGGCCCTGTC CACGGGGTTG T7CGGTGGCG 
GGGGGGGGGC TAGTGAGTCA CGGGCCTGGA ATCTCGCCTC TGGGTGGTTT CGGTAGATGG 
GGGCCGGGAG GATGGGGCCC CGCCCACCGC TGGCGCGCCC CAGAACATGG GTGGCTAACG 
CCTACATGGG CAGCTTGTCC TACGGTTACG CCCATTTGAG ACGGGTTAAC C AACTGTT A Z 
ACCCCTTCGC CGGGAACGCT ATAAAAACGA GGGACAGCAG CCCCCCCTCG CGCACTGCGC 
GCGCGGCGGC ACGTGGGACG GATCTCTTGG ATTTACCCGT AACGAGGAGC CCCGGCAGCA 
CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA 
CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA 
CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA 
CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA 24540
CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA CCCCAGGAGC CCCGGCAGCA 24600
CCCCAGGAGC CCCGGCGCGC CACCCTCCCC GGAGGGGGAT CCCGGCGCGC CACCCTCCCC 
GGAGGGGGAT CCCGGCGCGC CACCCTCCCC GGAGGGGGAT CCCGGCGCGC CACCCTCCCC 
GGAGGGGGAT CCCGGCGCGC CACCCTCCCC GGAGGGGGAT CCCGGCGCGC CACCCTCCCC 
GGAGGGGGAT CCCGGCGCGC CACCCTCCCC GGAGGGGGAT CCCGGCGCGC CACCCTCCCC 24840
GGAGGGGGAT CCCGGCGCGC CACCCTCCCC GGAGGGGGAT CCCGGCGCGC CACCCTCCCC 24900
GGCAACAACC TGTTGCCATG TATGGCGATT TGTATCAGTC ACAAGCACAC AACCCCTGCT
AGTATTAATG GTGTTTAAAA CGTTCTACAC GTACGGCGGA CCGCATCCGT CGCAAGCACG 
CGCATATAAC CCCCAAATGC ACCATGATGA GAAGCACAGC CACGCGTCAA AAAACTTTAA 25080
AAACATCGTT ATCCAATATC ATTAAAAACC ACACCGAAAT TTACACAGGT AGCACGTCA2 25140
CGTGTTAGTG TCACCCACTG TACACAAGGC GTGTCGTATA TGTAGTATAG GTATTTGATG 
AGGCGGAAGC ATATCCCGCT TCCAGCGAAC GGAAATAAGA ATCATCCGTT CCAGCATTTA 
TTCAAAGAGG GCACAGAGGA TTCACATTGT TTAGAGAGAG TTTTT 
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A7AC7TGGGC AG7ATTGGCC TACGATTTGG ■ GCGACGTTTC AGG7TGG7C7 A77C7C7G77 2 = 3 S C 
CACTTTTCCC CGGCTATTCT GTCCCAGCAT AGGCTCTTGA AA7AAACAA7 GTTTACCGAG 2 544: 
7AAAAGG77C CACTCACCCT CA777G7CGT TGCACCCATC CCCCC777GC 77AA7CA7CZ 255 0: 
GAAAACTAGA GGACACGGAT GGAAAACATA TCGCACGCGG GTTGTTTGAA AG7CAACAGC 2556: 
TACTTGTTT7 TAATGAGGAC AGATTTGGGC ACAGGCCAGA GGGTAAAGCC C7ACG7G7GC 25S2C 
GCGGGGGGGG GGGTGTATAC GC7GCGAAAA CCTGCACGGT GCATAACACC CAGGGCG72A 25660 
CG7CACA7A7 C7C7G7GCAC CCAAGTGGTT GTTCAACCGT TGTTTTTTGG A7GAT7777C 2 = 740 
CGCACCGGC7 TTTTTGTGGG CGCGCATAGG TCGGTACGCG C7GTCCCCC7 AAG7CCCGGA 2580C 
CGG7CG77CG GGCCCCCG7C CGGCTCGTCT CCGGATGAAC CGTCACGTTC 777G7C7C7A 2586C 
GAGGCGACGT CTCCTTCAGA TGACTCGTCC GTGGGCTCCT CGTCCGTCCC GCCCGCGGG7 2 5 920 
CCGACAAGGA CCGTCAATTC GATGTTATCT TCGTTCGCGG 77GGCCGGCG CGGCCGTCGG 25980 
TA7GGCAG7A CGGTCACCCG GGTGTTATTT GCCGCG7ATA A7GCCCTCAC AGTGCCAC7T 2604C 
ACGCGGCATA 7GCCGCCAAA 7GCAAACACA A7AAA7A777 GG7AAAACCC AAAGAAG7AG 
AGAAAACCGA GCACGGCCCC GGGGGAGAA7 G77CCCGCAG GAGCAG77AG GA7GACZAGG 
AGCGTCCAGG 7GCACAACGC CACGCCGACA AGCCCAGCCA CCACCACAGA C A7 "AG C AG A 
AACAG77CAA AAA777C77G GCGC7CCA7C 7CCGGCCACA GGT7AAGGCG ACTACGCCAC 
TGCGTGCGCG TGCGG7ATAT AACGCGACAC A 777 G AC AG G CCG7G777CG AGA2AC7G77 
AGCCAAG737 77AAACAC7G CGGG7GGACG ACA7CCAGC7 C7CCGG7ACA GGCGCAGGGG 
7G7A7GCC27 CG77CCCCAC C7C77CCCTA CA7A7CCAGC AGA7GGG7CC C7C7ACACCC 
TC77C7ACG7 CC77AGACGC CA7C7C7GCA GC7GGGGTGG AAG7C7GAAA AAGGGAAAGG 
GGAGG7GAGC AGAG7GCCCA G77AG777CC GACCCGCCG7 CCGCC77AC7 G7CGG7A7CC 
CGCC77GACA GA7G7C7AAC G7A77CACGG ACGCCACA7G 7G7G7C7A77 77CC7ACA7C 
CAGG7777CC C7GGAAAAC7 G7CAGAACCC ACCC7GC777 AGC7C7ACA7 C7G7A77777 
G777ACGCAC AGGA7CAACG C77CG7GCCC G7CCACCCCC GCGC7C7CCG 2C7G7G777G 
GAGG7777AT GAG7GG77AG 77C7AGGCAG C7CCGGACAA G77G7CCAAA A7ACGGCGCG 
CCCCGCGC77 CC77CCC7CC GGA7CCGCCC ACACCGGACC 7A7GAAA7AA GSGACACGCG 
TCA7CAC7AG 77A7GAGAGA AAAACCACAA CAGC777A77 GGAAAACACC 7GAG7G3A77 
C7CCACCCCC CGCG7ACGAC AGGCG777C7 G7GG7GCGC7 72TGGGAAAA A2G77777CC 
CCCA777C77 CC7CGACAGG 7C77C7AAGG 7AGA7AAA7C CCCCCCC777 GCGCG7C7CC 
7AGAA7G 3 CC TAGGCGCACG A7GGGG77G7 CGCC7CGAGC AG7TGGGCCG 2AG7GA7A7C 
77CAAC777C GACCGTCTAA GC7A7GGCAG GCAGCCGC7G CA7CAGC7GC C7AACCCAG7 
7777GGAAGG G7C7GCGCAG ATCTGACGCC C7CG777G37 CAGCAAAA7A A27CCGGG77 
T7GGGCACGC 7GGGGACG7G GGA7ACCAC7 C7T77AGAA7 TTGGACGGGC GG7GGG7GC7 
G37GGAACCC G7AGGAGCAG C7A77AGGCG TG7ACGACAC GAGTGACCCC GCGrT77C7G 
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TGGGCGTCAG GTAAAACGTG GGAAGCAGTA CGCTAACTGCA 3CATAAAAC3 733*33333= 
CCATCTGGAG GTG3CAAGTT CGCAACAGTC TAAAGAAAAC CGTAAAGGC7 A777GGGG77 
^ TGrrrcT ~- CRGA7GTAAC 3CCGAGTTCC 77ATATGCTT ACCTGATTCT G3T=T=A=-=T 
G7TTAT7TAT AGTGGCGTA7 GGTAACC3CC AGCTTACATG CGGGATAAG7 7GGC37AA37 =76 0C 
CACCAAAAAC G3G7TGCAGA CAAAAGTGAT TGTTGGGGCG CTTAC7TAGA AG37GTGAGG 
-^T-Mffi AACCCCGCCA ACG3CCG3AA AGCGCATGC3 77CCAGTCGG T3CG3CCT3C 
GCC3GC37CG CT3TG3CGCC 7T737GGGCT TTGAGTTCTG 7CAT7AAGCC AG3777CCA7 =- B u 
7GCCACCC33 GC3AAAACAA G7CG3GTAGT TTCAGGGGTC ATCTGGCGAT CAGTGTACCA 1784 0 
TATTCCCAC3 AC3CATCAAC ACC3=Ta=TT GAGGCGTGTC TCTGTATGTG 73A33GGAGA 
CTGCATGTAT CGT3CA7ATC TGTATTGTGC C-CTTGCGCGG AGACAACATA CCGACGACCA 
AG7CAGGGGT 3A3C73CAG7 GCACGCCGCT AGGTGGGACC GTGGGCGAGC CGAAATAA77 

ATAT; . TTT -. T ttggCACGGT tgtgagcaac gccatcgtga gttggttaat accctctaaa 

CG3ATAGTCT 777777 ATT7 GTCAACCAAC CAGTCAATCA CC7GTCATCG CCGCTCAGAA 
GCACACGTCT TC3G=CAATG CCGTGTTGGC GGGTTTGAC C ACGGTTACTG ATAGGTAGAC 
GAGTCCGACA ATCA3ACACG TCCG3CAGCG A77TGCAGCG CAGCTAAAAT CG3G7G3CCG 
GG7TGGTA3A AG C AAATT AT CCAAT3GTCG 7GTT7GG3TT TGTTTTGGGG 77ATC7ACAT 
A7TA7A77" 7TA7"CGAC TG3TTGCGGA AGTATTCGCA G CTTGG 7TAC T3TG37CGA7 
TACCCCGTGA ATAA37GGGC G3533GTGAC CCAACATAGT G ATT CG3TAG AT77333G3A 

M 3—TATTAAT GTTCATCCGT ATTGTGTATA T37AA7T7GG 2S500 

TTTCCATATT TGGTAGGAGT ATGGAGTTTT CTTATGGATT ATTAAGGGTC AGCTTGAAGG 28560
ATGATGTTAA T3ACATAAAG GGGCGTGGCT TCCAAAAATG GGTGGCTAAC CTGTCCAAAA 28620
TATGGGAAC^ CTGGAGATAA AAGGGGCCAG CTT5AGTCAG TTTAGCACTG GGACTGCCCA 28680
GTCACCTTGG CTGCCGCTTC ACCTATGGAT TTTGTGCTCG CTGCTTGCCT TCTTGCCGCT 28740
TCTGGTTTTC ATTG3TGCCG CCGATTGTGG GTTGATTGCG TCGCTTTTGG CAATATACCC 
ATCCTGGCTT TCGGCTAGGT TTTCC3TCCT ACTTTTCCCA CATTGGCCTG AGAGCTGTAG 
TACAAAAAAC ACCGCGCGGT CTG3AGCTCT CCATAAGCCC GCAGAACAAA A3CTGCGAT. 
TGCCCAAAAA CCTTGCCATG GCAACTATAC AGTCACCCCT TGCGGGTTAT T3CATTGGAT 
TCAVTCTCCA GGCCAGTTGT AGCCCCCTTT TATGATATGC GAGGATACTT AACGTGTCTG 
AATGTGGAAT ATAATGTGAA AGGAAAGCAG CGCCCACTGG TGTATCAGAA CA3TGGTGCA 29100
CTACCTATIT GCTCATTCGT TGTTTCGGTT CTGTGTTTGT CTGATTCTTA 3ATAGTGTTG 29160
AGGTAATTCT AGAAAGCGGA TTGAGTGTAA ATCGGGCCAC TTTGCCCTAA ATGTGACAAT 29220
CT3GATGTGT AT CTT ATTGG TGCGTTGTGA AG C ATTTTAA AATGCGTTTT AGATTGTATC 
AGGCTAGTGC TGTAATGGTG TGTTTATTTT TCCAGTGTAA GCAAGTCGAT TTGAATGACA 
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TAAAGTGGTG TGCGGCAGCT GGGAGCGCTC TTTCAATGTT AATG7TTTAA TGTGTATGTT 
GTGTTGGAAG TTCCAGGCTA ATATTTGATG TTTTGCTAGG TTGACTAACC- ATGTTTTCTT 
GTAGGTGAAA GCGTTGTGTA ACAATGATAA CGGTGTTTTG GCTGGGTTTT TCCTTGTTCC- 
CACCGGACAC CTCCAGTGAC CAGACGGCAA GGTTTTTATC CCAGTGTATA TTGGAAAAAC 
ATG TT AT ACT TTTGACAATT TAACGTGCCT AGAGCTCAAA TTAAACTAAT ACCATAACGT 
AATGCAACTT ACAACATAAA TAAAGGTCAA TGTTTAATCC AT ATTT CCTG ACTTGTGTCT 
TGACTTGCGT CGATTGGGAT GGGGGTGTGG GATGGGGGTG TGGGATGGGG GTGTGGGATG 
GGGGTGTGGG ATGGGGGTGT GGGATGGGGG TGTGGGATGG GGGTGTGGGA TGGGGGTGTG 
GGATGGGGGT GTGGGATGGG GGTGTGGGAT GGGGGTGTGG GATGGGGGTG TGGGATGGGG 
GTAAATGACA ATGGGGGTAA ATGACAATGG GGCGCTTGGT GACACATTTG CCCCACCGTC 
GCCTGCCCGG AACCAGCTTG GTGATGTGCT GTCTGGCTCT CAGGTGCACT TTATGCAAAG 
CAGTTGAGGC GCATTAGATA TATAAAACTT GGGTACACAC CCTTGGTGCT GTGCGCGTGC 
TATGTGCCCT GGTGACCGTC CACAATGGAC GAGGACGTTT TGCCTGGAGA GGTGTTGGCC 
ATTGAAGGGA TATTCATGGC CTGTGGATTA AACGAACCTG AGTACCTGTA CCATCCTTTG 
CTCAGCCCTA TTAAG CT AT A CAT C AC AG G C TTAATGCGAG ACAAGGAGTC TTTATTCGAG 
GCCATGTTGG CTAATGTGAG ATTTC ACAG C ACCACCGGTA TAAACCAGCT TGGGTTGAGC 
ATGCTGCAGG TTAGCGGCGA TGGAAACATG AACTGGGGGC GAG CCCTGGC TATACTGACC 
TTTGGCAG7T TTGTGGCCCA GAAGTTATCC AACGAACCTC ACCTGCGAGA CTTTGCTTTG 
GCCGTTTTAr CTGTA7ATGC GTATGAAGCA ATCGGACCCC AGTGGTTTCG CGCTCGCGGA 
GGCTGGCGAG GCCTGAAGGC GTATTGTACA CAGGTGCTTA CCAGAAGAAG GGGACGGAGA 
ATGACAGCGC TATTGGGAAG CATTGCATTA TTGGCCACTA TATTGGCAGC GGTCGCGATG 
AG C AG GAG AT AACGCGTAAT TCGAGGTCCC CGGAAGAGTA GAGGGTTGCA TGTTATACAA 
ACAACATAAA CATTAAATGA ACATTGTTCA AAACGTATGT TTATTTTTTT TCAAACAGGG 
GAGTAGGGTA GGAAGGGTAC GTCTAATACG TAACTGTTCG CTACTGCTTG TTCAGGAGCT 
CCTCGCAGAA CATCTTGCGA ATTTTAGATT TTGGACTAGA GCGACTGCTG GCCTCAACGC 
GGTTCGATGT AGGGTTCGGC GTAGGAGCGT CTTTCTCCAC CGCCGCGCAT GGTGTATGCG 
TGGTCTCCGG TGCCTGTTGT TG G ATG CT C T GCGTGCTGGA GGCGGG3GTG GGTTCAGCG3 
GTGGTGCGCC AACTACCGCG AGTCCTGTAG AS ACTGG CGG GTGGCTCACA TGTGGCTGAG 
CAAAAAGGAT GGGCGCCGCT TGCTGGAACT GACCGTGTGG CGCCTGCACG TAAATGGGTG 
GGTGTACGTA GGTTCCTCCG TGCTCCTTCA TTGTCGGGAA TTGACACGGG ACCGCTGAAT 
TGGCGTGGGG CCTGTAGTGT GG AT CT AC TG CGGCTGCTGC TGCA3AGGAG GACGGC3GTG 
GCCCT3CGTG CCAACCGTTC AGTTTCATCT CTTTGAGTTC A3ACTGTATT TCTCGCTATGT 
TCTTTGACAT GGACAAGATA T C CTTGTG AT ACGCCGGCTC CTCTCCTG3A AA3AG3TGTC 31380 
CTTC3T::GTC ctctGCGCCG CGCTTGCGCT TCCCCGTCCT ATATCCAG3C A3CTGTGGCG 3144 0 
AGTAATACCA TGGATCGTAT GGGTTCTTGT AAGCGTAGCC GTATG3TG3C GC 31500
AAACATACGA AGGTAGGTGA TGGTCGGTGG GGAACATCTG GCCCCCACAC CCCATTAGGC 
CTGGCCCTGA AAGTGTATGT GACATTTTTG CCGCTGTGGT CTTCATTCCA TCGATGCTGC 31620
TTTGTAGCAT GCTCAGGAAG GCGGATTTGG GGATGGATAT GATATCCTCT TGACCAGAGC 31680
TGTTCATGGC TGGTCTGGGT GGTGTGACGG CTTGGATGCC GACCGG3AAT TGGCTGGCC7 
TTAAATACGC CGGGCTCAAT ATGCTGGCCA CACCTCTGTC AGTTTTCAAT AGGTCGAGGC 
GGTCCCGTAT G AAG C TG G C A TCTATAGCTT TTGCCATTAA GGTCTCCAGG GGACTGACGA 
AATTTGGTG 7 GGAAAGGTCC TCCAGCCTGC AG CT ACTT AC GTGCTGGAGG ATGTGGGCGC 
GCTCCGACTT AGATACTGAT GAGAATCTGG AAACCACCCA CTCGGCGTCG 7GTCCGTACA 
CGGCCAGTGT GCCGCGTCGG CGCCCCAGGG CGCATAGTGA TACGTGTTGA AACACGGGAC 
CGCTGGGAGT CTGGGATAAC TCGCGGGGAT GTATAGACGA TAAAGACAGC CCCGGGAGCC 
ACGTGTGGAG TATCTCCAAC AGTGGTTCCT TAGGGAGATT TTTCACGGGG GCTCTGGCCA 
CGTGGGAGGT GTCCGCCAGC CTGGATGCCA GCTCTAGGAA GGCTGGCGAC GTGATGGCTC 
CGGTGCAGAA AA7ACCGTGG GACACTTGAA AT AG AC C CAG TGTCCAGCCC ACTTCTGTCT 
CTGGTAGGTG TTCGATTGTT ATTGGAAGGG GTTCTGTGAC TGGGAGATAA TCCGTCACCT 
GATCCGGATC GAG AT AG AG C TCTTGCTCCA GCTTGGGGCA GGACACAACA TCTACAAACC 32400 
CTCCGACGTA CAGGCCCTGT GCCATGCTCG GAAAATACGT GTGTGAGACC GAGCCGCTGA 
GCCCGGGGCT TAGGAGGCTC ATGTGGCGCT TTTTGCAAAA TAAGAATTTA AATACATT C Z 
ACGCCCAAGA GC7GCGTTTT ATT C ATTTGG TTCTCTGCAG GATGTACAAT TTCGGTCTAA 
ATGTGTACCT GTTAAGGGAG GCTACTGCCA ATGCCGGGAC CTACGACGAG GTGGTCCTGG 
GACGCAAGG7 7CCTGCGGAG GTGTGGAAGC TCGTGTACGA TGGGCTCGAG GAGATGGGCG 
TGTCAAGTGA GATGCTGCTG TGTGAGGCAT ACCGGGACAG CCTCTGGATG CACTTGAACG 
ATAAGGTG G 3 GCTCTTGAGG GGCCTGGCGA ATTATCTGTT TCACCGGCTA GGGGTCACCC 
ACGACGTTCG CATCGCCCCG GAAAAC CTGG TGGACGGAAA CTTTTTGTTT AATCTGGGAA 
GTGTGCTCCC CTGCAGGCTG CTCCTTGCGG CGGGCTACTG CCTCGCCTTT TGGGGCAGCG 
ATGAACACGA ACGCTGGGTG CGCTTCTTCG CCCAGAAGCT TTTCATTTG Z TACCTGATAG 33000
TCTCCGGGCG TCTTATGCCA CAGAGGTCTC TGCTAGTTTG GGCCAGCGAA ACGGGCTATC 33060
CCGG7CCG3T GGAGGCAGTC TGTC3C3A-CA TCCGCTCCAT GTACGGCATA CGAACGTATG 
CGGTCTCGGG TTATCTTCCG GCTCCGTCCG AAGCG GAG CT GGCCTACCTT GC-TGCGTTTA 
ACAACAACG C GGTTTAAACG ACCG CGAGGA CCACCGGCAG GCAGCCAAGA ACCATAAAGT 
ACGCTCTATC GTAGTCATCG CCGCCGCCAA ACTGGGACTT GATAATCTCC TGGAGAAGGG 
TGGGTGGGGA TGGGTGTGAA AGCAGGACGT CCAGGCCCTC TTCTGTTGCC AGGCGGAGGG 
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ACACGGTCGA TCGCGCCTCG AGGGCGGCCA G7ATTATGCC AGGGAAGATG AAGGACACGC- 
GGGCGTTTGG ATTAGCCTGC AGTGTGGGGA TTATGTAGTG CTCCGATATG AACGAAAATA 
GCTGGCCCCT TTTCAG CATG GGGGCGTTTG GATCCGGTAG GGCACCGGGC TGAAATTTGG 
GTCCCAGCAG GGATAC CAGG TTCAAGCGGC GGTTTGGGTG CCCTCGCGCG ACTTGCCCAA 
ACTCCAGCAA TCCATACGCG AGGATAAACA CCTCCAGCGC AACAA7CCCC GC7CGCAGG7 
TCCACTGGTA TGCGGAAAAT GGTGGTATAT CGGACCCAAA CATGGCGCTC GTAATGGCGA - 
ATACCAAGTC CATGGCGGGC GCTGTCCCTG GCGCGCCCG7 ACCCTTGTTG TGGGGAAATA 
ATCCAGCCTT AG C CATC ATT GCGTGAAGCT-. TGTGGCGCTG GAAGAAGGCT GTCG3A7AGC 
GGCTCTCCT7 ATTGAGAGGC GCCAGCGAGG CGCGCTCCTG GGGGTTTGAG TATGTGAAGC 
TGAAGTCC CC AGGACCGCTT TCCTGTTTTA GCTGAGTGAT TAGCAGGTCT AGCTTTTGAG 
GCAGG7C7GC TAACAGGTCA TCGGGAGTAG CGGGCAGTTG CCTGGATGTC TTTTGACAAA 
AGTACG CGTT GACGAGGCAA AGCGCGGCCT GGGTGTCCGT GAGATGCCTG GCGTCGGCGA 
AAAAGTCAGC GGTGGTCGAG GCGACCGTCG TCAGGGTGTG AG AG ATG AG T TTGAGCGATC- 
TGGAATTCTG AAAGTTAACA GTCCCCTTTA GTTCTTTAGG GAAGACGCGC CGCTGCATGG 
CGTTGTCCGT GAGGCTGATG AACCACGGCC CAAAGGATGG CAACCACTGA TTCTGGTTCA 
TGTACAGGGT GGGCATGAGC TCGCCGCGCA GGTCCCTGTC AACGGAGAAG TGAGGG7CCC 
CGGGGACGAT CGCCACGGTG AAGTTACGGT GGC7GGCCTG CGGGGGGGA7 G7CAC7AAGG . 
GAGGCTCATG GGAACGGCTT TGGGGCATGT C7A7G7TGTC AGACCATGTC ATGTTGC7TA 
TCATCTGTT7 CACCGCGTCG ATATCTGCGT TAATGACGCC- GACGCG7GAG TCA7GGA7C7 
GAACAAGCCG GTCCAGCTCT AGGGAAAGCA GGTGTGCCTT TGTCTTTCGT 7CTCGA777C 
GCACGAG77G GCTGCGCAG7 CCAAGGGCGA CCC77C77G7 T7CTTCCATG 37GGGC77G7 
GAA7AAACAG CACGTT7TC0 GGGTGTGGGG CCCAGAA7C7 7CCCGCC777 G7CCA7777C 
GG7777T7GG GTACCT7AGA TAGGACCTTT C7GA7GTCAG CATT77C7C7 AGCAG7GAGA 
AAGGCGCACA ATTTTCCTTC GGTGGTGTGC ACCGGCGTGG GAAACGGCCC GGGTGATTCA 
GAGTATACTG TCTTTAGTGT 777C7GA77C 77AAA7A7CA GCAGGGGCG7 GATAGTCCAC 
GCCTCGGTAC CCGGAGGGGC CGAGTGAGCG ATGTAA7GGA TCGAGTCGGA GAG77GGCAC 
AGGCCTTGAG CTCGC7G7GA CG77C7CACG G7G77GG7TG GGA7CAGC7G G7GAC7CAGA 

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35100 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
CAAGTCT7GA GCTCTACAAC GTAACATACG GGCTGATGCC CACCCGATAC CAGAATTACG
CAGTCGGCAA TTCTGTGCCC TAGAGTCACC TCAAAGAATA ATCTGTGGTC TCCAAGGGGA
GGG-TC-GGG GCCGGCTACT TAGAAACCGC CATAGATCGG GCAGGGTGGA GTAC7TGAGG
AGCCGGCGGT AGGTGGCCAG GTGGGCCCGG TTACCTGCTC TTTTGCGTGC TGCTGGAAGC 240

CTGCTCAGGG ATTTCTTAAC CTCGGCCTCG GTTGGACGTA CCATGGCAGA AGGCGGTTTT 
GGAGCGGACT CGGTGGGGCG CGGCGGAGAA AAGGCCTCTG TGACTAGGGG AGGCAGGTGG 
GACTTGGGGA GCTCGGACGA CGAATCAAGC ACCTCCACAA CCAGCACGGA TATGGACGAC 
CTCCCTGAGG AGAGGAAACC ACTAACGGGA AAGTCTGTAA AAACCTCGTA CATATACGAC 
GTGCCCACCG TCCCGACTAG CAAGCCGTGG CATTTAATGC ACGACAACTC CCTCTACGCA 
ACGCCTAGGT TTCCGCCCAG ACCTCTCATA CGGCACCCTT CCGAAAAAGG CAGCATTTT7 
GCCAGTCGGT TGTCAGCGAC TGACGACGAC TCGGGAGACT ACGCGCCAAT GGATCGCT7C 
GCCTTCCAGA GCCCCAGGGT GTGTGGTCGC CCTCCCCTTC CGCCTCCAAA TCACCCACC7 
CCGGCAACTA GGCCGGCAGA CGCGTCAATG GGGGACGTGG GCTGGGCGGA TCTGCAGGGA 
C7CAAGAGGA CCCCAAAGGG ATTTTTAAAA ACATCTACCA AGGGGGGCAG TCTCAAAGCC 
CGTGGACGCG ATGTAGSTGA CCGTCTCAGG GACGGCGGCT TTG CCTTTAG TCCTAGGGGC 
GTGAAATC7G CCATAGGGCA AAACATTAAA TCATGGTTGG GGATCGGAGA ATCA7CGGCG 
ACTGCTG7CC CCG7CACCAC GCAGC7TATG G7ACCGGTGC AC CT CAT TAG AACGCC7GTC- 
ACCGTGGACT ACAGGAATGT TTATTTGCTT TACTTAGAGG GGGTAATGGG TGTGGGCAAA 
TCAACGC7GG TCAACGCCGT GTGCGGGATC 7TGCCCCAGG AGAGAGTGAC AAGT7TTCCC 
GAGCCCATGG TGTACTGGAC GAGGGCATTT ACAGATTG7T ACAAGGAAAT 77CCCACC7G 
ATGAAGTCTG G7AAGGCGGG ■ AGACCCGCTG ACGTCTGCCA AAAT ATACT C AT3CCAAAAC 
AAGTTTTCGC TCCCC7TCCG GACGAACGCC ACC3CTATCC TGCGAATGAT GCAGCCC7GG 
AACGTTGGGG GTGGGTCTGG GAGGGGCACT CACTGGTGCG 7C7TTGATAG GCATC7CCTC 1380

TCCCCAGCAG TGGTGTTCCC TCTCATGCAC CTGAAGCACG GCCGCCTA7C 7T77GATCAC 1440

TTCT7TCAAT TAC777CCA7 CTTTAGAGCC ACAGAAGGCG ACG7GGTCGC CATTCTCACC 
CTCTCCAGCG CCGAGTCG77 GCGGCGGGTC AGGGGGAGGG GAAGAAAGAA CGACGGGACG 
GTGGAGCAAA ACTACATCAG AGAATTGGCG T3GGCT7ATC ACGCCG7GTA CTGTTCA73G 
A7CAT3T73C AG T A C AT C AC TGTGGAGCAG ATGG7ACAAC TATGCG7ACA AACCACAAA7 
A^CCGGAAA 7 CTG CTTC CG CAGCGTGCGC C7GGCACACA AG GAG G AAA C TTT 
CTTCACGAGC AGAGCATGCT ACCTATGATC ACCGGTGTAC TGGATCCC 

CCCGTCGTGA TCGAG CTTTG CTTTTGTTTC 77CACAGAGC TGAGAAAA7T ACAATTTATC 
GTAGCCGACG CGGATAAGTT CCACGACGAC G7A73CGGCC TGTGGACCGA AA7C7ACAGG 
CAGA7CCTG7 CCAA7CCGGC TATTAAACCC AGGGCCA7CA AC7GGCCAGG ATTAGAGA3G 
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CAGTCTAAAG CAGTTAATCA CCTAGAGGAG ACATGCAGGG TCTAGCCTTC TTGGCGGCCr 204 
TTGCATGCTG GCGATGCATA TCGTTGACAT GTGGAGCCAC TGGCGCGTTG CCGACAACG5 
CGACGACAAT AACCCGCTCC GCCACGCAGC TCATCAATGG GAGAACCAAC CTCTCCATAG 
AACTGGAATT CAACGGCACT AGTTTTTTTC TAAATTGGCA AAATCTGTTG AATGTGATCA
CGGAGCCGGC CCTGACAGAG TTGTGGACCT CCGCCGAAGT CGCCGAGGAC CTCAGGGTAA
CTCTGAAAAA GAGGCAAAGT CTTTTTTTCC CCAACAAGAC AGTTGTGATC TCTGGAGACG
GCCATCGCTA TACGTGCGAG GTGCCGACGT CGTCGCAAAC TTATAACATC ACCAAGGGCT 2400

TTAACTATAG CGCTCTGCCC GGGCACCTTG GCGGATTTGG GATCAACGCG CGTCTGGTAC 
TGGGTGATAT CTTCGCATCA AAATGGTCGC TATTCGCGAG GGACACCCCA GAGTATCGGG 
TGTTTTACCC AATGATTGTC ATGGCCGTCA AGTTTTCCAT ATCCATTGGC AACAACGAGT 
CCGGCGTAGC GCTCTATGGA GTGGTGTCGG AAGATTTCGT GGTCGTCACG CTCCACAACA 2640
GGTCCAAAGA GGCTAACGAG ACGGCGTCCC ATCTTCTGTT CGGTCTCCCG GATTCACTGC 
CATCTCTGAA GGGCCATGCC AC CTATG ATG AACTCACGTT CGCCCGAAAC GCAAAATATG 
CGCTAGTGGC GATCCTGCCT AAAGATTCTT ACCAGACACT C CTTAC AG AG AATTACACTC 
GCATATTTCT GAACATGACG GAGTCGACGC CCCTCGAGTT CACGCGGACG AT CC AG ACTA 
GGATCGTATC AATCGAGGCC AGGCGCGCCT GCGCAGCTCA AGAGGCGGCG CCGGACATAT 
TCTTGGTGTT GTTTCAGATG TTGGTGGCAC A CTTT CTTGT TGCGCGGGGC ATT A C C G AG C 
ACCGATTTGT GGAGGTGGAC TGCGTGTGTC GGCAGTATGC GGAACTGTAT TTTCTZCGCC 
GCATCTCGCG TCTGTGCATG CCCACGTTCA CCACTGTCGG GTATAAC CAC ACCACCCTTG 
GCGCTGTGGC CGCCACACAA ATAGCTCGCG TGTCCGCCAC GAAGTTGG CC AGTTTGCCCC 
GCTCTTCCCA GGAAACAGTG CTGGCCATGG TCCAGCTTGG CGCCCGTGAT GGZGCCGTCG 
CTTCCTCCAT TCTGGAGGGC ATTGCTATGG TCGTCGAACA TATGTATACC GCCTACACTT 
ATGTGTACAC ACTCG3CGAT ACTGAAAGAA AATTAATGTT GGACATACAC ACGGTCC7CA 
C CG AC AG CTG CCCGCCCAAA GACTCCGGAG TATCAGAAAA G CTACTG AG A ACATATTTGA 
TGTTCACATC AATGTGTACC AACAT AG AG C TGGGCGAAAT GATCGCCCGC TTTTCCAAAC 
CGGACAGCCT TAACATCTAT AGGGCATTCT CCCCCTGCTT TCTAGGACTA AGGTACGATT 
TGCATCCAGC CAAGTTGCGC GCCGAGGCGC CGCAGTCGTC CGCTCTGACG CGGACTGCCG 
TTGCCAGAGG AACATCGGGA TTCGCAGAAT TGCTCCACGC GCTGCACCTC GATAGCTTAA 
ATTTAATTCC GGC3ATTAAC TGTTCAAAGA TTACAGCCGA CAAGATAATA GCTACGGTAC 
CCTTGCCTCA CGTCACGTAT AT CAT C AG TT CCGAAGCACT CTCGAACGCT GTTGTCTACG 
AGGTGTCGGA GATCTTCCTC AAGAGTGCCA TGTTTATATC TGCTATCAAA 
CCGGCTTTAA CTTTTCTCAG ATTGATAGGC ACATTCCCAT AGTCTACAAC A 
CAAGAAGAGG TTGCCCCCTT TGTGACTCTG TAATCATGAG CTACGATGAG AGCGATGGCC 
TGCAGTCTCT CATGTATGTC ACTAATGAAA GGGTGCAGAC CAACCTCTTT TTAGATAAGT 
CA-CTTTC7T TGATAATAAC AACC7ACACA 7T3A77A777 G73G3TGAGG GACAACGGGA 4080
CCGTA3TGGA GATAAGGGGC ATGTATAGAA GACGCGCAGC CAGTGC7T7G TTTC7 AA77 2 4140
TCTCT7T7A7 TGGG77CTCG GGGGT7A7C7 AG77TC7T7A CAGACTG777 7CCA7C3777 4200
A77AGACGG7 CAATAAAGCG TAGA77T77A AAAGGTTTCC TG7GCA7TC7 7777G7A7GG 4260
GCATATACTT GGCAAGAAAT CCGAGCACC7 CAGAAAG7GG A77GCCG7CA CA7A7CAG77 4320
CGACCACCCC TGCACCTAGC CATGCGGCGC T77GACGG7C 777GGGGC7A GACATCATAA 4380
AGTACTT7TC CATGGC7TCT ATAAGCACC7 TGGAACAA7C TGGGGGTTG3 CGAATGG3T7 4440
CC37AAA3GG GAAATCCTC7 ATGG7ATTCA GGCAGAAGAC CGCGTCCTCC ACCCGACGT7 4500
TGAGTC777C TAGCAGAGCG CCGAAGAACT CCC3CTC3TG TGT77TCGGA GGGGCAAG77 4560
CTGCGCCGTA CAGCGATGAG AAACACGACA CGA7GTTTTC CAGCCCCATG CT3CGCAGCA 4620
ACACGTGC77 CAGGAACAGG TGTTG7AGCC GG7TCAGTTT TAGCTTGGG7 AGAAAAG7TA 4680
TCGAG7737T AGCA3G3TCC ATGATGGTAA CGGTGTTGAA G7CACAGACC GG3C777C7C 4740
CGA3TCTCGG CCGCC7GAGT CCAA7CATG7 AGAACATAGA CG3G3CCTC3 TT373T3737 4800
TAAGTGACAC GA7ATCCC3T TCGCAAACCT GTGCGATGTT GTGTTT3AG7 A7A3A777GG 4860
TC7GACCGGC ACGGGGTG77 ATGGGGTGAC GCGGTAAAGG CGACTCTGGG TCAAACACC7 4920
TTATGGGG77 GGCGGCCTCG TCGATGACGA CACGC7TGTT CGCGGCGTGT A7GGGGACGC 4980
GACGGCATCC CG3733CAGA 7CTATAATCT TAAAG77GG7 A7AAGAC7GG 7CGC7CG77A 5040
T3GCCA3CC3 GCAC7CCGG7 A37ATC73CG 7G7CC7CGAA 77CG7GGCCG GG7AC3A37G 5100
GC7T33A373 CAGG7AAACG CCAAGAGA7G CGG7C7C7TC GZCTACGCAC AAG733377C 5160
7TAACGCG7A GGGGTGCGGT GAGAGCA7GA TCG37AGCAA CGATAG7T22 G3373C77A3 5220
CCGCG7AGA3 TGGCAGGG7A 3ACGAGTCCG GA37CCCAAA C7777C3AAC AACA37333A 5280
TCGGGAC77C AGGA77AGAG ACTCCCAC2A TGGCC3CCAC CGCCGGAGAG GTCAA3ACG7 5340
GAAACACGCG C7C3CC7G7C GA3AGGCGCG CCGCGCCCT3 7AC7AGAC7A 33C773AC37 5400
CCGGAAC7CG 7AACA7ACC7 TAGACGAGCG GAC33AC3CA AC37AC3733 33ATC333T3 5460
GCGGTGTCTG C7CG77GGAC GCGGCCGTTC GGTGG2GCCA 37G3A33CC7 A37773CGAA 5520
TGGCGTGACG GACAA7T7GT G3C777AGAG CG3C3AACCG A73ACCCG7G G7GGCGAC3A 5580
ACGAAATGAA G777GCA77G CGGCC3AAC7 C37CTAGC- j. ^ - - «.
AGA77TTCGG GA7TAGG77A CAC7T777AT ATCCCAGxAC 7G232AC7CG 73777G3777 5700
7A3TG7GACT GA7TA7C7TC 77TGAGAAG7 CAAACAGGCC CC33GC3G3G GC72GGC7AA 5760
TGCAAGCCAC GTCAAGCCTG AGAAACGAAC A3CA77CCAC CAGACAC7C2 AGGAACG777 5820
TGTG7AG337 CTG7A777GG GAACGG777C 73733TCAAG 7A333A3AA7 A7737A77 5880
TG777CCGTC GATGCGCGCG TGC7G37CCG 7GAGAATGGG C333A337; TG333AA7C7 5940
G77CCACAA3 AGGC7GCCCG 7ACACT77AG AAA7CGTGGC T37CG33G3C 77AAAC2A3G 6000
ACAC3777AG C3CAT3C77G CTG3AGACCA CAGA7GGAAA G777GTG373 GAAAA7A3G7 6060



PCT/US97/13346 

WO 98/04576 




TTTTTCGCCC CATTCTCACC ATGTACTGGT TTTCCAGTCC GTGCAGGTCC AACGTGGAGT 
TCCAATTTGC T AT CG AT AC A GGAAATATGT GCCTGATTGG CAGAAAGCAT TTCAGCGTAC 
CCATTGCGAA G AG AAAGTG C AGCATGTCCC CACTGATGTT GATGTTTAT7 GCGGTGCCT7 
GACACATGTT GTCGGAAAAA AACACGCTTA TGGTAAAAGA AGGTTCCTTT ACGGAGTACT 
TTCGTATAAC AAAATTGTTG GTCAATCTGG GGATGTTTAA AATAGTCTTT TGCAGGGTG7 
TAGGAACGTG GCAGCTTATC TTAGTGTTAA TCACCATGTT GGTGTTGAAT ATGGTGATCT 
TGAAGTTTTC CAAACTGACG TGTTTTGTGG GTTCCAGCAT GTCTGACACT G TAG AG CTG Z 
CCAGAG7CCG CGCGTCCGTG GCCGCGTATC GTTGGAAGCA CG C CTG C AAA TTTCCTTTCA 
TGGCTGCTCG CCGGTCTTTC GGCGCGTACC GGATTCTTGA AAGCGTCGCC GCCAGGAGAC 
GCGGTGTCTC GTGGGTGCCT AAAAAGTTTG CGCAGGGGTG CAGTCCGCTG CACGAGTGGC 
CGATGCAGTC TGCCACTGCC ATACACATGA CGAGTCTGTA GATGGCCGGT GTGCCCGGAT 
ACACTAGATA GTAGGTACAA TCTGGGGTAC TGACGACCAC CCTGTATGGC TTTGGTCCGG 
GGTCCTTGCG TTGGATTTTT ACGTGCAGAC GGGACACGAG CTGGTTTAGA G C C AG CTG AA 
AGCCCACCAG ATCCCGTCCG TTAACCTTGA CGTCCTGGTG CTTACTCTGT TTCGACAGGT 
TCTTCAGCAC GGTGG3CAGT CGCTCTACGT TGTGAGCGAT GGCACGGCGC AGCGAGACCA 
GCTCTCCGTG CCACCCCCAC GTGGCCATGA AG CTG CTG AT GTTAAACTTT AAAAAATGTA 
GCTGTGCGTC TGGGGATGCG GGTGGCATTA TTGAAAACGA GAGATGCTTC AGGCTCTCCA 
GGAGTGCAAA ATAATTTTGA TAGATTGTGG GTTGTAGACT ATGGGGCAAC ACCGCCAGAA 
ACGCATGAAA ACAC7GTTCG AA CT C C C AG A ACTCCAGGTA CCTGCACACT ATCCTGAACA 
TGGCTTTGTA ACATATGGTG CACGTTAGTA GCGCGGGAAG ATACAGCGAG CGTAGCTCCC 
TGAATTCGCA GGGTT7A7CA CAATCATCGG TAAGTTCCCA TGATCCCACC GCAGGTAGGT 
AGTTGTCGGT GTC7ATCTGT CCGCG2GTAA ACACTCCACC ACCGTCAATT ATTAAACCTT 
CGCCGCTGTA CCGTC3ACCC ACTTTTCCCA AAAGAGTCCC TTCTTGATGT ATAAAAGGGT 
GGAGGCGTTC CCCCA3GAGT AGTCTGCGTA TCGCTCTGCA GGCGAAAAAG GTGGGCTCGG 
GCTGCATCAT CTTATCAAGA CCTTCTAAGG TCAGCTCTGC CTGCAGGTGC GAGTTGGTGG 
CCAGACAGCA GAATATTTCC AGCTGTGATT CCCAAGTCGC TTGATAACAC GTGGTCTGCG 
GACTCGTCGT CAGGGAGGCG CTCGG7GGCA GTAGTAGGGG GCCCTCGAGC GCTGCCATGG 
AGGCGACCTT GGAGCAACGA CCTTTCCCGT ACCTCGCCAC GGAGGCCAAC CTCCTAACGC 
AGATTAAGGA GTCGGCTGCC GAC3GACTCT TCAAGAGCTT TCAGC7A77G C7CGGCAAGG 7800
ACGCCAGAGA AGGCAGTGTC CG777CGAAG CGC7ACTGGG CGTATATACC AATGTGGTGG 7860
AGTTTGTTAA GTTTCTGGAG ACCGCCCTCG CCGCCGCTTG CGTCAATACC GAGTTCAAGG 7920

ACCTGCGGAG AATGATAGAT GGAAAAATAC AGTTTAAAAT TTCAATGCCC ACTATTGCCC 
ACGGAGACGG GAG3AGGCCC AACAAGCAGA GACAGTATAT CGTCATGAAG GCTTGCAATA 
AGCACCAZAT CGG7GCG3AG ATTGAGCTTG CGGCCGCAGA CATCGAGC7T CTCTTCGCCG 
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GGACGCCGAT CACACATCTG GTTTCGGCTC TCCTCGACCC GCATCTGCTG ZZ 
CCTACCACGA TGTCTTTACG GATCTTATGC AGAAGTCATC CAGACAACCC ATAATCAAGA 
TCGGGGATCA AAACTACGAC AACCCTCAAA ATAGGGCGAC ATTCATCAAC CTCAGGGGTC 
GCATGGAGGA CCTAGTCAAT AACCTTGTTA ACATTTACCA GACAAGG3TC AATGAGGACC 
ATGACGAGAG ACACGTCCTG GACGTGGCGC CCCTGGACGA GAATGACTAC ^ZZCSGTCZ 
TCGAGAAGCT ATT CT ACT AT GTTTTAATGC C3GT3TGCAG TAACGGCCAZ ATGTGCG3TA 



E16< 

szz: 
52s; 



AGAAAGAGAC GCCCTTGGAC TTCACAGAGT ACGCGGGTGC CATCAAGACG ATTACGTCGG 
CTTTGCAGTT TGGTATGGAC G C C C TAG AAC GGGGGTTAGT GGACA0GG7T CTCGCAGTTA 
AACTTCGGCA CGCTCCACCC GTCTTTATTT TAAAGACGCT GGGCGATCCC GTCTACTCTG 
AGAGGGGCCT CAAAAAGGCC GTCAAGTCTG ACATGGTATC CATGTTCAAG GCACACCTCA 8340

TAGAACATTC ATTTTTTCTA GATAAGGCCG AGCTCATGAC AAGGGGGAAG CAGTATGT C Z 8400

TAACCATGCT CTCCGACATG CTGGCCGCGG TGTGCGAGGA TACCGTCTTT AAGGGTGTCA 
GCACGTACAC CACGGCCTCT GGGCAGCAGG TGGCCGGCGT CCTGGAGACG ACGGACAGCG 
TCATGAGA-G GCTGATGAAC CTGCTGGGGC AAGTGGAAAG TGCCATGTCC GGGCCCGCGG 
CCTACGCCAG CTACGTTGTC AGGGGTGCCA ACCTCGTCAC CGCCGTTAGC TACGGAAGGG 
CGATGAGAAA CTTTGAACAG TTTATGGCAC GCATAGTGGA CCATCCCAAC GCTCTGCCGT 
CTGTGGAAGG TGACAAGGCC GCTCTGGCGG ACGGACACGA C GAG ATT GAG AGAACCCGCA 
TCGCCGCCTC TCTCGTCAAG ATAGGGGATA AGTTTGTGGC CATTGAAAGT TTGCAGCGCA 
TGTACAACGA GACTCAGTTT CCCTGCCCAC TGAACCGGCG GATCCAGTAC A C CT ATTT C T 
TCCCTGTTGG CCTTCACCTT CCCGTGCCCC G CT ACT CG AC ATCCGTCTCA GTCAGGGGCG 
TAGAATCCCC GGCCATCCAG TCGACCGAGA CGTGGGTGGT TAATAAAAAC AACGTGCCTC 
TTTGCTTCGG TT AC CAAAAC GCCCTCAAAA GCATATGCCA CCCTCGAATG CACAACCCCA 
CCCAGTCAGC CCAGGCACTA AACCAAGCTT TTCCCGATCC CGACGGGGGA CATGGGTACG 
GTCTCAGGTA TGAGCAGACG CCAAACATGA AC CTATTCAG AACGTTCCAC CAGTATTACA 
TGGGGAAAAA CGTGGCATTT GTTCCCGATG TGGCCCAAAA AGCGCTCGTA AC CACGG AGG 
ATCTACTGZA CCCAACCTCT CACCGTCTCC TCAGATTGGA GGTCCACCCC TT CTTTG ATT 
TTTTTGTGCA CCCCTGTCCT GGAGCGAGAG GATCGTACCG CGCCACCCAr AGAACAATGG 
TTGGAAATAT ACCACAACCG CTCGCTCCAA GGGAGTTTCA GGAAAGTAGA GGGGCGCAGT 
TCGACGCTGT GACGAATATG ACACACGTCA TAGACCAGCT AACTATTGAC GTCATACAGG 9480 
AGACGGCATT TGACCCCGCG TATCCCCTGT TCTGCTATGT AATCGAAGCA ATGATTCACG 9540 
GACAGGAAGA AAAATTCGTG ATGAACATGC CCCTCATTGC CCTGGTCATT CAAACCTACT 
GGGTCAACTC GGGAAAACTG GCGTTTGTGA ACAGTTATCA CATGGTTAGA TTCATCTGTA 
CGCATATGGG GAATGGAAGC ATCCCTAAGG AGGCGCACGG CCACTACCGG AAAATCTTAG 
GCGAGCTCAT CGCCCTTGAG CAGGCGCTTC TCAAGCTCGC GGGACACGAG ACGGTGGGTG 
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TGGGGGTCGA CTATCAAAAC GTGGCCCTGA CGCTGACTTA CAACGGCCCC GTCTTTGCGG 

ACGTCGTGAA CGCACAGGAT GAT ATT CT AC TGCACCTGGA GAACGGAACC TTGAAGGACA 

TTCTGCAGGC AG G C G AC ATA CGCCCGACGG TGGACATGAT CAGGGTGCTG TGCACCTCG7 

TGACGTG CCCTTTCGTC ACCCAGGCCG CTCGCGTGAT CACAAAGCGG GACCCGGCGZ 



TTl 
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102s: 

1C22C 
1038C 
1044C 
105OC 



AGAGTTTTGC CACGCACGAA TACGGGAAGG ATGTGGCGCA GACCGTG CTT GTTAATGGCT 
TTGGTGCGTT CGCGGTGGCG GACCGCTCTC GCGAGGCGGC GGAGACTATG TTTTATCCGG 
TACCCTTTAA CAAGCTCTAC GCTGACCCGT TGGTGGCTGC CACACTGCAT CCGCTCCTGG 10560
CAAACTATGT CACCAGGCTC CCCAACCAGA GAAACGCGGT GGTCTTTAAC GTGCGATCCA 
ATCTCATGGC AGAATATGAG GAATGG CAC A AGTCGCCCGT CGCGGCGTAT GCCGCGTCTT 
GTCAGGCCAC CCCGGGCGCC ATTAGCGCCA TGGTGAG CAT GCACCAAAAA CTATCTGCCG 
CCAGTTTCAT TTGCCAGGCA AAACACCGCA TGCACCCTGG TTTTGCCATG ACAGTCGTCA 
GGACGGACGA GGTTCTAGCA GAGCACATCC TATACTGCTC CAGGGCGTCG ACATCCATGT 
TTGTGGGCTT GCCTTCGGTG GTACGGCGCG AGGTACGTTC GGACGCGGTG ACTTTTGAAA 
TTACCCACGA GATCGCTTCC CTGCACACCG CACTTGGCTA CTCATCAGTC ATCGCCCCGG 
CCCACGTGGC CGCCATAACT ACAGACATGG GAGTACATTG TCAGGACCTC TTTATGATTT 
TCCCAGGGGA CGCGTATCAG GACCGCCAGC TGCATGACTA TATCAAAATG AAAGCGGGCG 
TGCAAACCGG C7CACCGGGA AACAGAATGG ATCACGTGGG ATACACTGCT GGGGTTCCTC 
GCTGCGAGAA CCTGCCCGGT TTGAGTCATG GTCAGCTGGC AACCTGCGAG ATAATTCCCA 
CGCCGGT2AC ATCTGACGTT GCCTATTTCC AGACCCCCAG CAACCCCCGG GGGCGTGCGG 
CGTGCGTGGT GTCGTGTGAT G CTT A C AG T A ACGAAAGCGC AGAGCGTTTG CTCTACGACC 
ATTC AAT AC C AGACCCCGCG TACGAATGCC GGTCCACCAA CAACCCGTGG GCTTCGCAGC 
■GTGGCTCCCT CGGCGACGTG CTATACAATA TCACCTTTCG CCAGACTGCS CTGCCGG3CA 
TGTACAGTCC TTGTCGGCAG TTCTTCCACA AGGAAGACAT TATGCGGTA2 AATAGGGGGT 
TGTACACTTT GGTTAATGAG TATTCTGCCA GGCTTGCTGG GGCCCCCGCC ACCAGCACTA 11580 
CAGACCTCCA GTACGTCGTG GTCAACGGTA CAGACGTGTT TTTGGACCAG CCTTGCCATA 
TG CTGCAGG A GGCCTATCCC ACGCTCGCCG CCAGCCACAG AGTTATGCTT GACGAGTACA 
TGTCAAA2AA GCAGACACAC GCCCCAGTAC ACATGGGCCA G TAT CT 2 ATT GAAGAGGTGG 
CGCCGATGAA GAGACTATTA AAGCTCGGAA ACAAGGTGGT G T ATT AG CT A ACCCTTCTAG 
CGTTGGCTAG TCATGGCACT CGACAAGAGT ATAGTGGTTA A2TTCACCTC CAGACTCTTC 11880
GCTGATGAAC TGGCCGCCCT TCAGTCAAAA ATAGGGAGCG TACTGCCGC7 ZGGAGATTGC 11940
CACCGTTTAC AAAATATACA GGCATTGGGC CTGGGGTGCG TATGCTCACG TGAGACATCT 12000 
CCGGACTACA TCCAAATTAT GCAGTATCTA TCCAAGTGCA CACTCGCTGT C2TGGAGGAG 
GTTCGCCCGG ACAGCCTGCG CCTAACGCGG ATGGATCCCT CTGACAAC2T T 2 AG ATAAAA 
AACGTATATG CCCCCTTTTT TCAGTGGGAC AGCAACACCC AGCTAGCAG7 GCTACCrCCA 
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^ TTTTAGC = GAAAGGATTC CACCAT7GTG CTCGAATCCA ACGGATTTGA CCTCGTGT7C 



:ac gctattctgc agcagctgtt GGTGTACCA3 



CCCATGGTCG TGCCGCAGCA ACTGC 
ATCTACTCCA AAATATCGGC CGGGGCCCCG GATGATGTAA A7ATGGCGGA A.CTTGAT CTA 
TATACCACCA ATGTGTCATT TATGGGGCGC ACATATCGTC TGGACGTAGA CAACACGGA7 
CCACGTACTG CCCTGCGAGT GCTTGACGAT CTGTCCATGT ACCTTTGTAT CCTATCAGCC 
TTGGTTCCCA GGGGGTGTCT CCGTCTGCTC ACGGCGCTCG TGCGGCACGA CAGGCATCC7 
CTGACAGAGG TGTTTGAGGG GGTGGTGCCA GATGAGGTGA C C AG GAT AG A TCTCGACCAG 
TTGAGCGTCC CAGATGACAT CACCAGGATG CGCGTCATGT TCTCCTATCT TCAGAGTCTC 
AGTT C TAT AT TTAATCTTGG CCCCAGACTG CACGTGTATG CCTACTCGGC AGAGAC77TG 
GCGGCCT2CT GTTGGTATTC CCCACGCTAA CGATTTGAAG CGGGGGGGGG GTATGGCGTC 
ATCTGATAT7 CTGTCGGTTG CAAGGACGGA TGACGGCTCC GTCTGTGAAG TCTCCCTGCG 
TGGAGGTAGG AAAAAAACTA CCGTCTACCT GCCGGACACT GAACCCTGGG TG G TAG AG A C 
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CGACGCCAT2 AAAGACGCCT TCCTCAGCGA CGGGATCGTG GATATGGC 

TCGTGGTGCC CTGCCCT2AA ATTCTCACAA CGGCTTGAGG ATGGTG CTTT TTTGTTATTG 
TTACTTGCAA AATTGTGTGT ACCTAGCCCT GTTTCTGTGC CCCCTTAATC CTTACTTGGT 
AACTCCCTCA AGCATTGAGT TTGCCGAGCC CGTTGTGGCA CCTGAGGTGC TCTTCCCACA 13140 
CCCGGCTGAG ATGTCTCGCG GTTGCGATGA CGCGATTTTC TGTAAACTGC C CT AT AC C G 7 
GCCTATAATC AACACCACGT TTGGACGCAT TTACCCGAAC TCTACACGCG AGCCGGACGG 
CAGGCCTACG GATTACTCCA TGGCCCTTAG AAGGGCTTTT GCAGTTATGG TTAACACGTC 
ATGTGCAGGA GTGACATTGT GCCGCGGAGA AACTCAGACC GCATCCCGTA AZZACACTGA 

— — r- G . — -TAG^TC ACAACTGTCA 1344 0 

GTGGGAAAAT CTGCTGGCTA TGT.TiCTGi ^,-,^A. G — .*«w»ww 

CCCGGAAG2A CTGT27ATCG CGAGCGGCAT CTTTGACGAG CGTGACTATG GATTATTCAT 13500 
CTCTCAGC2C CGGAG2GTG2 CCTCGCCTAC CCCTTGCGAC GTGTCGTGGG AAGATAT2TA 
CAACGGGACT TACCTAGCTC GGCCTGGAAA CTGTGACCCC TGGCCCAATC 7ATCCACCCC 
TCCCTTGATT CTAAATTTTA AATAAAGGTG TGTCACTGGT TACACCACGA TTAAAAACCA 
CTCACTGAGA TGTCTTTTTA ACCGCTAAGG GATTATACCG GGATTTAAAA CCGCCCACTG 
ATTTTTTTAC GCTAAGAGTT GGGTGCTTGG GGGGTTTTGC ATTGCTCTGT TGTAAACTAT 
ATATAAGTTA AACCAAAATT CGCAGGGAGA CAAGGTGACG GTGGTGAGAA C7CAGTTGAG 
AGTCAGAGAA TACAGTGCTA ATCAGGGTAG ATGAGCATGA CTTCCCCGTC T7CAGTCACC 
GGAGGAATGG TGGACGGCTC CGTGCTGGTG CGAATGGCCA CCAAGCCTCC CGTGAT . Go- 
CTTATAACAG TGCTCTTCCT CCTAGTCATA GGCGCCTGCG TCTACTG2TG CATTCGCGTG 
TTCCTGG2GG CTCGACTGTG GCGCGCCACC C2ACTAGGCA GG3CCACCGT G G C G TAT 2 AG 
GTCCTTCGCA CCCTGGGACC GCAGGCCGGG TCACATG2AC CGCCGACGGT GGGCATAGCT 
ACCCAGGAGC CCTACCGTAC AATATACATG CC AG ATT AG A ACGGGGTGTG TGCTATAATG 
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GATGGC7ATG GGGGGGCTGT AGATAATTGA GCGCTGTGCT 7TTATTGTGG GGATATGGGC 
TTGTACATGT GTCTATCATC GGTAGCCATA AAATGGGCCA TGACAACTGC CACAAGTAAG 
TCGTCCGACA TGTGGTTTTG CTTGGCGCTG TATGACTGCC CTCCATCCCT AAGCGGGACG 
CACTTGATCG CGCGGACCTG TTCTACCAGG TAGGTCACCG GGTCAAATGA TATTTTGATG 
GTGTTGGACA CCACCGTC7G GCTGGCGCTC AGGGTGCCGG AGTTGAGAGC GTAGATGAAT 14520
GTCTCAAACG CGGAGGATTT CTCGCCTCCC AACATGTAAA 7TGGCCACTG CAGGGCGCTG 14580
CTCTTGTCAG TATAGTGTAG AAAATGTATG GGGAGCGGGC ATATTTCGTT AAGGACGGT7 14640
GCAATGGCCA CCCCAGAATC TTGGCTGCTG TTGCCTTCGA CCGCCGCGTT CACGCGCTCA 14700
ATTGTGGGGT GGAGCACAGC GATCGCCTTA ATCATCGTGC ATGCGCAGGA CGCTATCTCG 14760
TAAGCAGCTG CGCCAGTGAG GTCGCGCAGG AAGAAATGCT CCATGCCCAA TATGAGGGTT 14820
CTGGTGGGAG TCTGAGTACT CGTGACAACG GCGCCCACGC CAGTACCGGA CGCCTCCGTG 14880
TTGTTCGTAT ACGCGGGGTC GATGTAAACA AACAG CTGTT 7TCCAAGGCA CTTCTGAACC 
TGCTGGGCGG TGGTGTCTAC CCGACACATG TCAAACTGTG TCAGCGCTGC GTCACCCACC 
ACGCGGTAAA G C G TAG C ATT TGACGACGCT GCTCCCTCGC CCATTAGTTC GGTGTCGAAT 
GCCCCCTCCA TAAAGAGGTT GGTGGTGGTT TTGATGGATT CGTCGATGGT GATGTACGTC 
GGAATGTGCA GTCTGTAACA AGGACAGGAC ACTAGTGCGT CTTGCAGGTG GAAATCTTCG 
CGGTGGTCCG CACACACGTA ACTGACCACA TTCAGCATCT TTTCCTGGGC GTTCC7GAGG 
TTAAGCAGGA AACTCGTGGA GCGGTCTGAC GAGTTCACGG ATGATATAAA TATAAGCTTG 
GCGT77TT77 GAAGCATGAA ACCCAGAATA GCCGGCAGTG CATCCTTTTT AA7AAAATTC 15360
GCCTCGTCTA CG7AGAGCAG GTTAAAGGTC TGTCCCCGAA TGCTCTGGAG ACACGGAAAG 15420
ACACAAAAGA GGGGGTCATA AGCGGCTAAC AGTAAAGGAG AGGAG3CGAA CAG7GCG7GG 15480
CTCTTGTTZT TGGGAATAAA AGGGGGCGTG 7G7GCCGA7C GTATGGGTGA GCCAG7GGA7 15540
CGTGGACATG TGGTGAATGA GAAAGATTTT GAGGAGTGTG AACAAT7T77 CAGTGAACGC 15600
CTTAGGG AG Z AAGTGGTCGC GGGGGTCAGG GCACTCGACG GCCTCGGTGT CGCTGACTC7 
CTATGTCACA AAACAGAAAG ACTCTGGCTG CTGATGGACC TGGTGGGCAC GGAGTGC7TT 
GCGAGGGTGT GCGGCCTAGA CACCGGTGCG AAAT3AAGAG TGTGGCGAG7 CCC7TAT3TC 
AGTTCCACGG CGTGTTTTGG CTGTACCAGT GTCGCGAGTG GGTGGCATAG CACGTGTGTG 
ATGGGGGCGC CGAATGCGTT CTCCTGCATA CGCGGGAGAG CGTCATCTGC GAACTAAGGG 
GTAACTGGAT GCTCGGCAAC ATTCAAGAGG GCCAGT77T7 AGGGCCGGTA CCGTA7CGGA 
CTTTGGATAA CCAGGTTGAC AGGGACGCAT ATCACGGGAT GGTAGCGTGT CTGAAACGGG 
ACATTGTGCG GTATTTGCAG ACATGGCCGG ACACCACCGT AATCGTGGAG GAAATAGCCC 
TGGGGGACGG CGTCACCGAC ACCATCTCGG CCA7TATAGA T3AAACATTC GGTGAGTGTC 16140
TTCCCGTACT GGGGGAGGCC CAAGGCGGGT ACGCGATGGT GTGTAGCATG TATGTGCACG 
T7ATCG7CTC CA777ATTCG ACAAAAACGG TGTAGAACAG TA7GCTATTT AAATG CACAA 
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AGAATAAAAA G7ACGACTGC ATTGCCAAGC GGG7GCGGAC AAAATGGATG CGCATGCTA7 
CAACGAAAGA TACGTAGGTC C7CGCTGCCA CCGTTTGGCC CACGTGGTGC TGCCTAGGAC 
CTTTCTGCTG CATCACGCCA TACCCCTGGA GCCCGAGATC AT CTTTT CCA CCTACACCCG 
GTTCAGCCGG TCGCCAGGGT CATCCCGCCG GTTGGTGGTG TGTGGGAAAC GTG7CC7GCC 
AGGGGAGGAA AACCAACTTG CGTCT7CACC TTC7GGCTTG GCGC7TAGCC TGCC7C7G77 
TTCCCACGAT GGGAAC7TTC ATCCATTTGA CATCTCGGTA CT3CGCATTT CCTGCCCTGG 
TTCT _ rTT AGTCTTACTG TCAGATTTCT CTATCTATCT CTGGTGGTGG C7ATGGGGGC 
GGGACGGAA7 AATGCGCGGA G7CCGACCGT TGACGGGGTA TCGCCGCCAG AGGGCGCCG7 
AC-CCCACCC7 TTGGAGGAAC TGCAGAGGCT GGCGCGTGCT ACGCCGGACC CGGCAC7CAC 
CCGTGGACCG 77GCAG3TCC TGACCGGCCT TCTCCGCGCA GGGTCAGACG GAGACCGCGC 
CACTCACCAC ATGGCGCTCG AGGCTCCGGG AACCGTGCGT GGAGAAAGCC TAGACCCGCC 
TGT77CACAG AAGGGGCCAG CGCGCACACG CCACAGGCCA CCCCCCGTGC GACTGAGCTT 
CAACCCCG7C AATGCCGATG TACCCGCTAC CTGGCGAGAC GCCACTAACG TGTACTCGGG 
TGCTCCCTAC TATG7GTGTG T7TACGAACG CGGTGGCCGT CAGGAAGACG ACTGGCTGCC 
GATACCACTG AGCT7CCCAG AAGAGCCCGT GCCCCCGCCA CCGGGCTTAG TGTTCATGGA 
CGAC77G77C A7TAACACGA AGCAGTGCGA C777GTGGAC ACGCTAGAGG CCGCCTGTCG 
CACGCAAGGC 7ACACG7TGA GACAGCGCGT GCCTGTCGCC ATTCCTCGCG ACGCGGAAA7 
CGCAGACGCA G77AAA7CGC AC77T77AGA GGC37GCC7A GTGTTACGGG GGCTGGC77C 
GGAGGCTAC-T GCC7GGA7AA GAGC7GCCAC G7CCCCGCCC CTTGGCCGCC ACGCCTGC7G 
GATGGAC37G 77AGGA77A7 GGGAAAGCCG CCCCCACACT C7AGG77T33 A G 77 A CGCGG 
CGTAAAC7G7 GGCGGCA2GG ACGG7GACTG G77AGAGATT 77AAAACAGC CCGATGTGCA 
AAAGACAGTC AGCGGGAGTC 77G7GGCATG CG7GATCGTC ACACCCGCA7 7GGAAGCC7G 
GCTTG7G77A CC7GGGG377 77GC7A77AA AGGCCGC7A7 AGGGCGTCGA AGGAGGA7C7 
GG7G77CA77 CGAGGCCGC7 A7GGCTAGCC GGAGGCGCAA AC77CGGAA7 77CCTAAACA 17700 
AGGAA7GCA7 ATGGACTGT7 AACCCAATG7 CAGGGGACCA TATCAAGG7C TTTAACGCCT 17760 
GCACC7C7A7 C7CGCCGGTG TATGACCCTG AGC7GGTAAC CAGC7ACGCA C7GAGCGTGC 
CTGC7TACAA 7G7G7C7GTG GC7ATCTTGC 7GCATAAAG7 CATGGGACCG TGTGTGGCTG 
7GGGAA77AA CGGAGAAATG A7CA73TACG TCGTAAGCCA GTG7G7TTC7 G7GCGGCCCG 
7CGCGGGGCG CGATGG7A7G GCGCTCA7CT AC777GGACA G7T7CTGGAG GAAGCATCCG 
GACTGAGA77 7CCC7ACAT7 GCTCCGCCGC CG7CGCGCGA ACACGTACC7 GACC7GACCA 180S0 
GACAAGAA77 AG7TCA7ACC TCCCAGGTGG 7GCGCCGCGG CGACCTGACG AA77GCACTA 18120 
TGGG7CTGGA ATTCAGGAAT G7GAACCCTT 77GT7TGGC7 CGGGGGCGGA 7GGGTG7GGC 
7GC7G77C77 GGGCG7GGAC TACATGGCGT TCTGTCCGGG TGTCGACGGA A7GCCG7CG7 
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GACTCCG7GG ACACG7TAAT GTATTTCGTG GGTACTGTTC TGCGCAGTCG CCGGGTCTA7 
CTAACAT CTG TCCCTGTATC AAATCATGTG GGACCGGGAA TGGAGTGACT AGGGTCACTG 
G AAA C AG AAA TTTTCTGGGT CTTCTGTTCG ATCCCATTGT CCAGAGCAGG GTAACAGC7C 
TGAAGATAAC TAGCCACCCA ACCCCCACGC ACGTCGAGAA TGTGCTAACA GGAGTGCTCG 
ACGACGGCAC CTTGGTGCCG TCCGTCCAAG GCACCCTGGG TCCTCTTACG AATGTCTGAC 18600 
TACTTCAGCC GCTTGCTGAT ATATGAGTGT AAAAAACTTA AGGCCCTGGG CTTACGTTCT 18660 
TATTGAAGCA TGTTGCGCAC ATCAGCGAGC TGGACCGTCC TCCGGGTCGC GTGTAGATTA 18720 
TGGT7CCG7T CTCCTTCTTG ATGTTTAAAT TTTTGGGGGG GAACCACCGA CAAAGCGTCT 18780
TTATGATTTC CGCGAACACG GAGTTGGCTA CGTGC7TTTG GTGGGCTACG TACCCAATG7 18840
TAATGTTCTC TACGGATGCC AGTAGCATGC TGATGATCGC CACCACTATC CATGTCTTTC 18900
CGTGTCTCCT TGGTATTAGG AATACGCTTG CCTTTTGCTT AAACGTCTGT AAAACACTGT 18960
TTGGAGTTTC AAATAAACCG AAGTACTGCT TAAACAATCC AAACAACTGG TGCGTCTTTT 19020
GTGGGGCCTT GATTGAAACC AAAAAGAAAA AAGTGTGCAT TACTAGCTGC TGTTGGAAGG 19080
GCTCCAGCCA GTGCACCCCG GGAACGTAAC AGCCGTTCAG AAAGGACGAA AGGTTAACCA 19140
GAAAAGCCTG AAGTTCGCGG TAGACAGAGC AGGCGTGCAG GGAGTCGTGT GTTTTTCTGG 19200
CCGCCTGGTA CTCGACCAGT TGATCGGCCG TGGAGACGTG CGCGTCCTCG CGCACACACC 19260
GCATCTGCAA GTATGTTGAT AGGGACTCCA ATAGGCGCGG CTTTGCGGGG ACGTTGTCCT 19320
CGGACGGTCT GGGGGTTCCC ACGTCGGGAT TTGCTGACGT GGGCGTGGCG GGATGGTGCC 19380
GTGTGCAGTA TGTTTCCAGG ACCGAACTGT ATGAGTTTAT TCTG7GCACC ACGCCAATAA 19440
AAGGGTGCGC CA7CCG7GCC GTTTTGGGAC AG7G7CGCG7 GAATGTCGGG GCACTCAGTT 19500
CCCAcr - CT;: TCCGGCGTCT TTGGCGGTCT CCTGCAGGTT GGCGGCAAGG CGCTCCCTGT 19560
GACGGCTGAG CAGCATGTTT GCTTTGAGCT CGCTCG7GTC CGAGGGTGAC CCGGAGGTGA 19620
CCAGTAGGTA CGTCAAGGGC GTACAACTTG CCCTGGACCT TAGCGAGAAC ACACCTGGAC 19680
AATTTAAGTT GATAGAAACT CCCCTGAACA GCTTCCTCTT GGTTTCCAAC G7GATGCCCG 19740
AGGTCCAGCC AATCTGCAGT GGCCGGCCGG CCTTGCGGCC AGACTTTAGT AATCTCCACT 19800
TGCCTAGACT GGAGAAGCTC CAGAGAGTCC TCGGGCAGGG TTTCGGGGCG GCGGGTGAGG 19860
AAATCGCACT GGACCCGTCT CACGTAGAAA CACACGAAAA GGGCCAGG7G TTCTACAACC 19920
ACTATGCTAC CGAGGAGTGG ACGT3GGC7T TGACTCTGAA TAAGGATGCG CTCCTTCGGG 19980
AGGCTGTAGA TGGCCTGTGT GACCCCGGAA CTTGGAAGGG TCTTCTTCCT GACGACCCCC 20040
TTCCGTTGCT ATGGCTGCTG TTCAACGGAC CCGCCTCTTT TTGTCGGGCC GACTGTTGCC 20100
TGTACAAGCA GCACTGCGGT 7ACCCGGGCC CGGTGCTACT TCCAGGTCAC ATGTACGCTC 20160
CCAAACGGGA TCTTTTGTCG TTCGTTAATC AT3CCCTGAA GTACACCAAG TTTCTATACG 20220
GAGATTTTTC CGGGACATGG GCGGCGGCTT GCCGCCCGCC ATTCGCTACT TCTCGGATAC 20280
AAAGGGTAGT GAGTCAGATG AAAATCATAG ATGCTTCCGA 20340




TCACATATAT CAGCAAAATA GCATAATTGC GGGTCAGGG3 ACCCACGTGG 
GTGGAATCCT ACTGTTGAGT GGAAAAGGGA CCCAGTATAT AACAGGCAAT GTTCAGACCC 
AAAGGTGT C Z AACTACGGGC GACTATCTAA TCATCCCATC G7ATGACATA Z CGG CGAT C A 
TCACCATGAT CAAGGAGAAT GGACTCAACC AACTCTAAAA GAGAGTTTAT TAAGTCGG" 
CTGGAGGCCA ACATCAACAG GAGGGCAGCT GTATCGCTAT TTGATCGT7T TGGGGGTAG Z 
AGCGCCGTGT TTGAGAAGCA GTTTCAGGAC GCACAGCATG CCGTCAGGGC CGACGGTGCA 
CTGAAGCG" AAGGCGAGCT CGGGACTCTG GTACGCAAGG CGGGCCAGAG GTTTGAGGCC- 
CTGAAAAGGC- AACGGTCAAT TTTGCGCCAG CCGCGCGACC TCCCACGGGT C3CCGACATT 
GACGCCCTGC- TCGACGCCGT CGCGGACCTC AAAGAAGAGG TGGCCGTGCG CCTAGATGCG 
CTGGAAGAGA ATGGAGAGGA GAC C CCCAGT CACTCCTCTT CGGAGATCAA GGACACAA77 
GTCAGGTG3A GGCTTGACGA TTTGCCCCCG GTGTGCCCTG AAACTCCCTA AGGCTACCCG 
GA7TTCAGAG AGACCCTGGG CGTCCACATG GCAGCTGAAT CAGCATATAC AGSTGTGCAA 
GAC7AAAAAG GCCACCGCGT ATCTTAAAGC GCCCCGTGAA TGGGGGCAG7 GCAC3CACCA 
GGATCCA3AC 7GG7CCAAGC GTCTGGGTCG TGGCGCCTTT GGGATAATCG 7CCC7AT77C =118 0 
C3AGGATCTG TGTGTGAA3C AGTTTGATAG CCGCCGGGAG TTTTTCTACG AGGCAA7TGC 2124 0 
CAACGACCTG ATGCAGGCCA CCCGAGAGAG GTACCCCATG CATTCTGGTG GA77TAGA7T 21300 
GCTAGGA772 G7GCAG2C77 GCATACCCTG TAGATC3ATT GTGTATCCTA GAAT3AAG7G 21350 
CAACCTGCTG CAGCTGGACT GGAGTCAGGT CAACCTGAGT G7CATGGCGG 2GGAG77-AC 21420 
CGGCC7AA7C- GC33CG3T3T CCTTTCTAAA CAGATACTG7 GGCATG37GC ACTGCGACGT 214 80 
tLtCCAGAC AA7AT7TTGG GGACAG3AGA C27AACGCCC A7GAA2C0C3 G3AGGCTGG7 21 54 0 
CC7TACC3AT 7TCGG7TC3G TTGCGCTACA CTCTGGGAGC AAGTGGACTA AC2TT37337 21600 
GACCTCTAAC CTGGGGTTTA A3CAACACTG CTACGACTTC AGGG7GCCAC 2CAAACTGAT 
~gtaag;at C7C7A7AAGC CGTC7TGCGT CCTC77GCAG T3TTACCTAT 2CAGT":GG 
-AAGATGCAC GCGCAGGTAT TGGACCAACC GTACCCTATC AGCC27AACA TGGGAC.^- 
GATGGAGATG TGGTGGTTGG GCTACACTCT G GTGACATG C CTGGAACTC7 AT CTGG AT G7 =1840 
GGCGCTAAAG AACCC7CTGA AGTTCTTGGG TTCAGCCACC AGAGAGGGA2 GGGGGGAACC 
CATGTACTAG TTGGGGTTGA T G ATT C G GAG GGTGGTGATG AGTGAGATG2 TGTGCGGTGT 
GTGGAGGATG ACGCTTGACC TGGGAGTAGA TTGCACCGGG AAAGGGGAG2- CGATTGGGAT 
GCGAGAGGAG CACGAGCTGG CGTTTCAGAA GCAGTGCTAT TTATATAAAG 22AAGGAAAA 
GGCAGAGTCG TTAGCGAACT GCTGGGATAA GGTAAACTGC G2GATGTTAA AGTG72TGGT 2214C 
TAGAAAGGTA CTAGAGCGAG AGTTTTTCAA GGATGGAGGC CAGGCCGACA 2GGGGGGAGT 222CC 
;tga AGACTATCTG GTTGACACCC tggatgggtt aagagtggat gaggaaga^g 
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ctgtggtggg aaggttgagg ttttgaaagt ttgtaaagga ggggaaggtt ggagagtggt 

GGGGAGAGGG GAAGATGGAA CGGAGGATGC CTGCGCTGCG GATGGGTTAG AAGTATTTG2 
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GTGTGTTTGG TGGTGTCAGG TTACTGGACG TGGCCAGCGT GTACGCCGCC TGTTCGCAAA = 250 C 

TGAACGCACA TCAGCGGCAC CACATCTGCT GTCTAGTGGA GAGGGCCACT AGTAGTCAGA 22S6C 

GTCTGAACCC CGTGTGGGAC GCCCTGCGAG ACGGAATTAT ATCTTCATCC AAGTTTCAC7 =2 62 0 

GGGCAGTTAA ACAACAGAAC ACTTCAAAAA AGA7ATTCAG CCCAT30CCT ATAACGAACA 22 68 0 

ACCACTTTGT CGCGGGCCCG CTTGCCTTTG GGCTGCGGTG CGAGGAGGTG GTGAAAACGT 22740 

TGCTG3CCAC CCTTTTGCAC CCGGACGAGA CAAATTGTCT CGATTATGGG TTTATG CAG A 2280C 

GTCCGCAAAA TGGAATATTT GGCGTGTCGC TGGATTTCGC GGCGAACGTC AAAACTGACA 22 86C 

CCGAGGGTCG TCTACAGTTT GACCCTAACT GTAAAGTGTA TGAAATAAAA TGCAGGTTCA 22 920 

AGTACACCTT TGCGAAAATG GAGTGTGACC CCA7ATACGC CGCGTATCAG CGGCTGTACG 22 980 

AGGCACCC3G AAAGCTGGCA CTGAAGGACT TCTTCTATAG CATTTCCAAG CCTGCGGTTG 23 040 

AGTACGTGGG ACTTGGAAAA CTGCCCAGTG AAT CTG ATTA CTTGGTGGCT TATGATCAGG 23100 

AATGGGAGGC GTGTC2TCGC AAAAAGAG G A AATTAACGCC CCTTCACAAT CT7ATTAGGG 23160 

AGTG7ATTTT GCACAACTCG ACCACGGAGT CTGACGTCTA CGTACTTACT GATCCTCAAG 23220 

ATACTCGGGG TCAAATCAGT ATTAAAG CCC GCTTCAAAGC CAACCTCTTC GTGAACGTCC 23280 

GTCACAGCTA CTTTTATCAG GTATTGCTGC AGAGTTCGAT CGTCGAGGAG TACATTGGCC 23340 

TAGATAGCGG CATTCCTCGC CTCGGATCAC CGAAATACTA CATCGCCACC GGCTTCTTCA 23400 

GAAAGCG3GG CTATCAGGAT CCTGTCAACT GTACCATCGG TGGCGATGCT TTAGACCC3C 2 346C 

ACGTGGAGAT TCCTACGCTG CTAATCGTAA CCCCCGTCTA CTTTCCCCGA GGCGCAAAGC 2352= 

ATCGTCTGC7 TCAC CAAGCT GCCAACTTTT GGTCAAGAAG TGCGAAGGAC A C CTTT C C AT 23580 

ATATCAAATG GGATTTCTCC TAT C TAT CTG CAAACGTCCC TCACAGCCCG TAGACGTGGA 2364 0 

CGGGGAACCG CTCGAC3TAG TCGTGGACTA TGACCCCATT CGC3TTTCAG AAAA3GGCAT 23700 

GTTGCTTGAG CAATC3CAAT CCCCATATCC CGCATTAAAA AAGAAGAAAA AAAATAAAGA 23760 

AG CAATTT AT TAAG2AAACA GTATGGTTTT CTGTACGTAT TTTATTCCGT G3TG3GTGAA 23820 

AAATAACGGG GGAT3GAGGA AGAGGGATGG GTTTATAATG CCAATATATC AGCTAAAT3A 23880 

ATJ , TC . TTTG cgtttcGTCG ATTTCACTGT CAC7TTCATG GTCGGACTGG TATT3G3TCC 23 940 

„.„"__, t-gg-G-GGG- T CTGG T CTTT GCTGGGAG3G 2400C 

TCGGGGCG33 CGTC3ATATG . ^ Cii .um-^u™- i^i^.- 

GCGGCG3TTT CTGGTGAACA GTCGGAGTTC TATCGACC3T CG3CGCCGAC GTCG - CAGAG 
GCATGTATGC CGCACTCGGC GTACAGAGTC CCCAGTCGCT C CTTATAACG CGTATAACGA 
TGGCTAGGAT GCACAGTATA GGGATACAGG AGATATTGAT AGCCACTATG TAGTGGAGAT 
TAGCCTGCAC GAA2GCGTTT TCATACCTGA TGACAGGCAG CA3TAGAATC AGATAACCCA 
CCAATACTCC CACGTAAAAG CCTACCTGCC GTCTCATAAA CTTTACCAG3 AAAAATTCCG 
TGTTTATGTA CCACACGACC GTCAAGGCTA G3AACATGTT CACCGCACCA AAAATGGC3T 
CTGACACGAG CACGTAAAAG CT3TTGCCAA CG3CCATCAT 3GTGCTCAAT GAAAACA3CA 
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G=ATTT==A GGCGGTTGTT GATAGGTACA GGTTGACGCA GACCGGTTTC CACCGAG7CA 
GCAG7GACTC CA7CA7GG7A T7A7CAGGTA CGTGCTGTTC CAG3AGAGGT A7TTCCCAC7 
GGGGGGAG77 ACATG7TATC AGTGACTGGA TG7GGGCAAA GGATATGCAA AAA7GAA73C 
AGTAGACAAA GGCTGCCATA AGTACGTGT7 TATATGACAG AACA7GGA7A AACA377G3A 
TGC7CCACA7 CCTTAAGATG GCGACATAAA GCACGC7ATG 7GA7CCAA37 AGCGCTA7CC 
AGGAT7G3A7 GC7CATCATG GTAGTGGCGT GAACATGCTT GGCC-GA7A7 ACG3CCA3CG 
CC3CGAGA3A GTAGTATACT ATGGCAATGC CGTCCACGAT AAAAGTCCAA AATATGTACA 
C-AGCATCTC TGGTTTGTCT AAAAACAGGG TCGGGGTGAG GTGGTTCGCT GAGT7GCGCA 
CCGTGAG377 7AGCGCGC7G TAGTTTACCA GATTGTTGAA GTAGGAGGGG AAAGCAA33C 
CGTCGTACGT GGCGGCCA7G GGCACGACTG CAGAGCAAAT GTACATAATT ACAGCCACAA 
ACAACAGC77 GACCCAGGAG GACATGAGAA AACGGTCGCT CT77GAAGCG CGCA7G777C 
T-CGGT=rTTT 7AAC777CGC CAG3CGGCGC 7GCGGCGGGA GAGCCAATCT GATGCCA™C- 
CCTATCGC3G 77GAC7777A AATACGCGCC CCGGGCAGAA GCCAGAGGTA 
7GAC7CAA7G GCAACGAGCG AAGAAACGGC GG-CGGTTAT GTCATCGGTG 
CAGCG773AC GTCCACTGCC GCA77A7TG7 CT3GCAGGTT AATTTTCTAC CCCTGGACCC 
AAACGACGG3 GAGACTGAA7 G37ACT77G7 GC-TGGACACG C7GACGAAAG AGGGGA7G3A 
GCGCATG3C3 GAAA7CCAGG AATGCGTCCC G7C7A7TACT GAACACGCCC G7GA3C7GG3 
GATCTGGGA3 77GGC3C7GC G ACTG 3AGAA TCA3AC3ATC GTCAAGG3C3 TCCGGACAGC 
GTCGCTTCCG G7GG77C7AA 7TA73AC7GT G33TCGCATA GTGAATGATG TGATTCC3T3 
C3CCAACG73 AGAACACCCA GACGACTAGC C737GCTTAC C7ACAC737G AGGCGACG37 
GACC7773AG GTCCCACTAA CCGGGCCC3C G333TCCACC G3AAC3TG3C ACAGC7CTA7 

, „, , -«t— g AAG AC C AG T 2 GAG SCAT ATA 2 5"?4C 

CTATAGGGAA TGTGCGATCT C^-~T-^ G«-«- — 

CTCC7GCZAG TCGAACGAGG CGCC7GAGGC CAAGAGGGAA AAGCGAGG77 7AGA7A7A72 25800 
AGATGTG77T G7CTG7C7CA C3TATGA7A7 C727A7CGCA GGG2GGGTCC 777C7C7GC7 25860 
GG7GCCCCAC GCGCCCGCT7 77CACG7CT7 A7GGA7CAA7 GAGGACAG 2 A AG7GGAACGG 
GGCAGCCG7C GAA77T7TCA GAGCCCTACA CCA7AAGC7G 77CAGTGAA7 GCAA7GG7A7 
AC7CCC727G T3G77G7ACG 7G77C2CGGG AGC7GTGGAA GAGGGCA7AG C2777GC3C2 
A7TACT7C7C GCAT7CCCTT GCA7A2777T G7GG7A7GGG 7CG7CTAC77 "TCTGGAGAG 
GGCG7CCG7G CAGTGGGACC 7A777GAACC G2ACA7CC7G ACCCAC7T7G A23GGA7AAA 26X60 
GCGAAC777T TTGGCAGATA CAG7G777GG G7ACGAC7CC CTGGGGAT7T 7AAGGGAA7 j 
TGAAGATCAG 7ATG7G7GGC C2ACGCC7GT CACTGACA77 AA7A77AA77 7G7GCACGGA 
TAGTGACACT ATGGCCA7CG 77AGAGAACC A7CCGGTCTG G7GGCCG7GA A727AGAAG2 
CCTG7TGZGC ACCGACTCCG 7AT7A7CGCG GG727CGTCC A77G7C7CA7 

t-CCACCCGGG AG7GC2GTAG GAGCGTGGAG Z77AGA7A2A A7 
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GTCGACTG7A TTATCATGGT CCACCTCTAG GGGTCACAAA TGGGCCGCAA TCGTGAAGTG 
GAAGTTATTT TTCCTCGTCC AAGCTTTGGA GCCTGAGGTG AGACCTACTG TCCCTGCT7G 
AAGCGGAGAG GGGGTGGTGC GAGTTGGCAG TTGACGGGTT TGTGATAGC7 GGAGTGCTGA 
CCACGGCACA GGACCCATTA ACTTTCCTAT G TG TTTATTT TTAGCAATGG TCTCCAGAA7 
TCAAGGATCT CAAAAGGGCC TGCCAGATGG CCGGGTTTAC TCTGAAGGGG GGGACTTCGG 
GGGATCTTGT ATTCTCATCG CATGCGAACT TGCTCTTTTC AACCTCGATG GGATATTTCC 
TCCATGCAGG CAGTCCAAGG TCGACAGCGG GGACGGGGGG TGAGCCTAAC CCACGTCACA 
TCACCGGA" AGACACTGAG GGAAATGGGG AACACAGAAA CTCCCCCAAC CTCTGCGGC7 
TTGTTACCTG GCTGCAAAGC TTAACCACAT GCATTGAACG AGCCCTAAAC ATGCCTCCCG 
ACACTTCCTG GCTGCAGCTG ATAGAGGAAG TG AT AC C C CT GTATTTTCAT AGGCGAAGAC 
AAA CATC ATT CTGGCTCATC CCCCTATCGC ACTGTGAAGG GATCCCAGTA TGCCCCCC7T 
TACCATTTGA CTGCCTAGCA CCAAGGCTGT TTATAGTAAC AAAGTCCGGA CCCATGTGTT 
ACCGGGCAGG CTTTTCGCTT CCTGTGGATG TTAATT AC CT G TT CTATTT A GAG C AG A C T C 
TGAAAGCTGT CCGGCAAGTT AGCCCACAGG AACACAACCC CCAAGACGCA AAGGAAATGA 
CTCTACAGCT AGAGGCCTGG ACCAGGCTTT TAT CTTTATT TTGAAAAAAG GGAAACAATG 
GGGGGTTTGA AAAGGGTGCA CATTTTCAGA TATTTTAAAA CTTCATTGTT CT CC AGGTG C 
TTGGTAAAGA T G G TAT C AC A ATAAAAAATG TTTACTGGGT CCGCGCAGGT TTGTTTGTCA 
T _ TC . TTCT CTCCACTAGA CTCCAGTTTA AAAGACTCTA GATAAATGGG TTTCATTAGT 
CCCCCCATGG GGGTTGAAG C GTCGCCTATC GCCTTATGAA GCTTAAACAT AACGAGTGGG 
GTGGCCCTGA AATGATCGTC CACGGACAGC TCGTAAACAA AGGCGGCCGT GGCAGTCAAC 
GTCTCTATAC CGTGCATGAC GAAGGCCGCG TCCATCCCCG GCGTCCTCTC ATGTGTCTTT 
CTGGCGCGAC AAATAATAGA TCTCAAAAAC GTTGGTGACA TGTCTCGACA GTTCTCGAGC 2778C 
ATCGATAACA GGCAGCAGAG CTCGGTTATG CCGGGAGATG TAGGTCTAAG GAGGCACAC7 =7840 
CGCTCTTGGA ACACGTGAGG GTGTAGGTCT ATGTGGGTCA CCATGTCTTC GTGCTCCACC 
AGG C AC AC C A CCGTAAATCC CACAAAGTTG GGCGAGGACA GGCGAGATTT CACGTGCTCC 
CTGAGACACG CTATATCTAA GTGGCCCATC ACGGACATTT TGGGGGTATT GCTTCCAACC 
AGTGCGTTGT TTTTCCTATG CACTTCCAGG ACAAGGCGGG GCACCACAGG GTGGGGGTAT 
ACGGGACAGG CCTCTTCTGA CTCGCGAGTC TTCGGGGCAT GAGTACTCAT TGGCACTCCA 
GTCAGTCTCG CCAGGGCCCT TTCCAGGGAC ATTCTCGAAG GGTGGTGTAA C7AGACAGTA 
TTTCTGTCCC ACGTCGGTTA TATACACAAA GAGTCTGCTA GT CTG AT AT A AATAGGCCGC 
GATGTCCTGC AAGCTGGAGG ATACGAAGGA GTGACTAATG AGCTCCATCT GAAGCAGGTC 
CGCGATCACA TACGTGAATG GACCAAGCAG GATGGATATG GTGTCCTGAG AATAGGTGAC 
GCTGAGCCGC TGCCCTTGGT TGTCAACAAC GGGAGCCAGC TTGTAGGTTT GAAACATCTC 
GCT _„_ C AGGTTCGTGA GATCTTTCAT GCTTTCTCTC ACTGGGGGTA TGTAAGAAGA 
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r*v. — "^^^GA TGGGATATGG GAAGACGTTA GCTGCAGA3A 
GAAAAAGCTA TTTAG CACGG CA..o— «» luJOftJAlbu 

GGGGTCCTGT AAACGTCCCA GAGATTGAAA TGTGTTGGCG G7CAGCAGA7 TCACACTG"2 
GGGACCCTTT GCGTCACCGG GCTGTTGGTG TGACAGCTGT GTCTCAATAC ATTTTAGCC7 
CTTCATGCAG AGCTCCCTCT CCTTTTCAAG TTGAGTTATT GTGTCAAATT GTTCGTTTAT 
CTGGTTGGTG AGACACTTGA AAACGCTGTT GGACACCTGG CGCCTGAGCC CCTGAGTGGT 
CGTCTCTTGG CCTGTGCCGA AT AG TTT ATT CTTGTCTACT ATGTTTTGGG ACACGTCGG7 
GACAAAGTCC TCCACGACGT CGGTGACACC GCTCACTGTC TTGTTTT CTG C CAGTTTCA7 
GA3CAGGTTG AGGAGCTCTC GCTTG3GGTC TGTTCTCTGA GAGGCCTGCT CCAGGTGGGT 
CATGATGTCT TTGTACACAT TGTTACAGGC GCTTCCAACG AGGGCCTTGG TGGGGGCTGT 
GTTCAGGAGC TGGCAAAGTT TTGCGTGCTC TGCCGTCCGG TGACAGCTCA TAATGCTGGT 
ATA CAT C CT C TGAATGGGGC TGTCAAAGAT CACCCGCCCA GCGAAGATGG CGGGCATAGT 
AATCACCTCC ACATGAACCC TTTTCTGCTT ATACAATCCC ACGAAAGTGT TTTTAACACA 
GTCATAGTC7 ATGCTCACCT CT G AG TAG C C CGGAATATAG AGGGCGCTTA AACTAGACAC 
CAGGTTGCTA ATCTCCTGAG TCACGCTGGT GAGTATCCGG CCTATGGTTT 7TTCACCAGA 
GGCCAGACGC TGGCAATCTT T CATC AG CTG TTCCTGGATA GAGTTAACCA GCTTGTGGTC 
GGGTGTGTGC TTGACGACTG GTACCATTCC TACCGTGACC ACCCAGTCTA CGTATCTCTC 
ATACGAGAGC TGTGTCTTGG CGTAGAGGAC CCGGTTGATG GCATTGAGAA GCAGGTGGTC 
TAATGTCATG CGCATAGTCT GGGCCCAGGA GTCGAAGGTT GACCTTCTGT AAGACCCCCA 
CTGTGCTTCC TTTTCTGGCC ACCTGGTTTT TGCTGAGGAC TCGTATGTCC TCCAGTCGGA 2964C 
CAAGACGTGG TCGTAGCTAC AGTTGGCCAA TGCATTCTTG TACAGGTGGA TAAATAGCTG 29700 
TCTGAAAAAA ACACCCGGGT 7TCGCAGGCT GCAGTGTAGA GTCTGACCTC TGACATAAGA 
ATACTTGCCT TGCAGGATCT. CAAAGAGGGA GATGGACAGC TCGGAAGGGT GCACTGATAT 
GGACGAGCCC AGCCCCGGGT TCATCCTCAA CAT G A CAT C G GATGCCAAAG 7GAG3AGCGT 
AGTGGAACAG ATTGACAGGT TGTCAAATAT CACTACCTCG CCCCCGGAGA TG33CTGGTA 
TGACCTAGAG TTCGATCCAC TGGAAGACGA AGGCCCCTTT CTGCCGTTT7 CGGCATACGT 
AATAACGGGG ACTGCAGGAG CG3GGAAAAG CACCAGCGTA TCCGCCCTAC AT C AG AAT CT 
CAACTGGCTA ATTACGGGGG CTACAGTGGT AGCGGCACAG AATCTTTCCA GGGCTTTAAA 
GTCCTACTGT CCCACTATAT ACCACGCCTT CGGATTCAAG AG C AG AC AC A T7AATATCTG 
CCAGAGGAAA GTGCCCAAGG TAACTCAGTC CTCGATCGAG CAACTCCAGA GATACGAGCT 30240 
GGCTAGGTAC TGGCCAACTG T C AC CG AT AT TATTCGAGAA TTTATGCGCA AGAAACAAAA 30300 

^^r— ■ — r kGACTC ■ TGCCGTA 7GGGTG3A3C 30360 

GGGGCAGTAT AGCTCCCTCT C-~~«G^>- .T.^ACT. — -u- 

TCT CGTC C CAT AT 3 0420 
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2S6&: 
2 5 7 4 C 
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29760 
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30060 
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CAATTTGTGG ACGAGTAACA TTATCGTGAT AGACGAAGCT GGAACCC 
TTTGACGGCC GTGGTGTTCT TCTATTGGTT TTACAACAGT TGGCTGGACA CGCCGCTAw 
CAGAAATGGT GCGGTGCCTT GCATAGTCTG CGTGGGGTCT CCGACCCA3A CGGACGCCT* 
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TCAGTCGGTC TTC AAC C ACA CGCAGCAGAG AAAC GAG ATA TCTGCCTGTG ATAATGTGC7 
CACCTTCCTA TTGGGAAAAC GTGAGGTTGC AGATTATATT AGGCTGSAC3 AGAATTGGGC 
CCTATTTATA AACAATAAGC G CTGTACGGA TCCCCAGTTT GGTCACTTGC TGAAGACCTT 
AGAATATAAT CTAGACATAT CACCAGAGTT AATGGACTAT ATAGATAGGT TTGTGGTTCC 
GAAGAGTAAG ATTCTGGACC CGCTCGAGTA TGCAGGGTGG ACAAGACTCT TCATCTCACA 
CCAGGAGGTG AAGT CTTTTC TGGCAACGCT GCACACCTGC CTGTCGAGTA ATAAGGATGC 
TGTGTCCACA AAG CTTTTC A CCTGCCCAGT GGTCTGTGAG GTGTTTACAG AGCCATTTGA 
GGAGTACAAA CGGGCGGTAG GCCTCACACA CATGACTCCC ATAGAATGGG TAACAAAAAA 
TCTTTTC AG G CTAAGTAACT ACTCGCAGTT TGCTGATCAG GACATGGCTG TGGTTGGGAC 
CTATATCACA GACGCGTCCA C AC AG AT C AC CTTCGCCACT AAATTTGTCA AAAACAGCTA 
TGCTACCCTT ACTGGAAAGA CCAAAAAATG TATATGCGGG TTTCACGGGT CATACCAAAG 
ATTCAAGTCC ATCCTAGACG GGGAGCTATT TATCGAAAGT CATTCGCACG ATAACCCCGC 
TTATGTGTAC AGTT7CCTTA GTACCCTGCT ATATAATGCC ATGTACT CAT TTTACGCGCA 
CGGGGTGAAG CAGGGGCATG AAGAATTCCT CAGGGACCTC AGGGAACTGC CGGTGTCTCA 
AGAGCTGATC TCTGAGATGA GCTCCGAGGA CGTTCTGGGG CAGGAGGGGG ACACAGATGC 
CTTCTACCTC ACCGCCAGCC TCCCACCATC CCCCACCCAC GCGGCTCTTC CAACACTGGT 
GG C CT ATT A C TCCGGGGCCA AGGAACTATT CTGCAACAGG CTGGCCCTGG CACGCCGACA 
CTTTGGTGAC GAGTTCCTCC ACTCCGATTT TTCAACGTTT ACGGTGAACA TCGTGGTGCG 
AGATGGC3TG GACTTTGTGT CCACTTCCCC CGGGCTCCAC GGTCTAGTGG CATACGCATC 
CACTATAGAC ACCTATATAA T C C AG G GAT A TACGTTCCTC CCAGTGAGAT TCGGCCGTCC 
AGGAGGACAG CGCCTCAGCG AGGACCTGCG CAGAAAGATG CCCTCCATAG TT3TCCA3GA 
CTCATCGGG3 TTCATTGCCT GCCTGGAAAA TAACGTCACC AAGATGACAG A3ACCCTC3A 
AGGTGGCGAC GTGTTTAACA TATGTTGTGC AGGGGACTAC G G TAT C AGTT CTAATCT3GC 
TATGAC CAT A GTGAAGGCAC AGGGGGTTTC A CTAAG TAG G G TGG C CAT AT C3TTCGGCAA 
CCACCGCAAT ATCAGAGCCA GTCTAGTGTA TGTGGGTGTA TCCAGGGCCA TCGACGCTCG 
TTACCTGGTA AT G G A C AG T A ATCCCCTTAA GCTAATGGAC CGCGGTGACG CCCAGTCCCC 
ATCCTCAAAG T AC AT CATC A AAG C C CT AT G CAACCCCAAG ACTACTCTGA TCTACTGACC 32160 
CGTACCCCTC T CTTAGG AC A CTGATGTGTT TGG 3 AAT AAA GCATGAGACT TG AC AC CTAT 3222Q 
AATGGTCTGT ATTGACACCA TTCTTTTATT TATCAGTCCA GCCACGGCCA GTTATATGCA 
CCGTTTCCAC ACAGGGGTGG CGTGGAGGCC AGGATGCGGG TTGGGTCGCT GCACCTGGAC 
CCCGCGGTAG TTGTGCTTCC TGATGAAATC GAGTGGGCGG AAGTAC7GGG A3ATTGG3TT 
GGGAGGTGAC CCTTTGTGCT CGACGGAGAC ACGATCACGC TCACGGCGGA C3AGGGCTCC 
TCGTCTGTGT CACTCCCCGA G G AT AT AATT ATCACGGACG CCACTGCTTT GCGGCTTAAG 
TTTGGTTGT C TCTGGCAGCG CACCACATCC TCGCTACCAG AGGAG3C33T AGACTGCCTT 
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2264 Z 
22101 
221 6 C 
22S2C 
22B6C 
3 2 940 
3300C 
3306C 
332.2C 
331SG 
23240 
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TTGCGCTTCT GGCCCACGTC CATGAGCCCG ATTCTCTGAC TCAATACTTC CCCTTGGTC7 
TCTCCGTCCT CCTCGGACGA GGGTGGCTGG TGGGAAAAAT GGCGCGCGTC GGTAAACGCG 
GCCTCATTGT TCACGTCCGG AGAGTTGGAA CTGTCATCGC TATCAGAGTC CGATGTCAGG 
TCGACGATCG CGGTGGGTGC GGCGCGCAGG GGGCGCCACG AGGGCCCTTC ATCAGGGTCG 
CTGTATGGTC- AACTTTGTGT TCCAGGTACA CTA7TTCTGG AAGCAGGTGA AAGTCCGTA7 
GCCCCGGTCC CAGTGTATGC CGCCATCGGT TCCAGGATAG CAACCCCCTC GTCGTCTGAA 
GGTGAGAGCC CAGCAGGGGA AAATCCGTCA TCCTGACTAA CCCATCCCAT GGACGCCTCG 
GACTCCGCCG TGTCCGTTGA ACTGCGCACG CGGCCCGCTA CCACTGC7AC CGGTTTGGGC 
GTATGGGCC C GTCTGGCCAG AGGCCTCGGG CGCAAGTGAG ATAAAGGTTG AAAAAAGTCT 
GCAGGGTACC CCTCTGGCTC GTCTTCCTCC TGAACATCGT CATTTTCTTC TTCATCTTCA 
TCTTCCTCA7 CCTC3TCATA TTCAGATTCG CCGCTCGACT GATCCGGGGA TATCTG TAG A 
TCCAGAGGGG TTGCTGGCGG CGATGGCGTG TCCTCGGCGA AGACGTCGTC TGGGGCAGAC 
AT AT C TAT C A CGGTGGGTGC AG CAT AG C C G CGCGGCCTGC CAAATCCTGG AAGTGATGAA 3 3 360 
AGAGGTGGAG GTGGGAATAT GAACTTCACG GGGGGTCGTC TGCGAGGCGC TCCTTCAATT 
GGAAGCATTC TCTCTTCATC GTGTGTGCTA GACGAGGTCC TCACAAACAT CGCCATGGCC 
TTGTACG3GG TTGACCGCTA GGGGCGGAAA TTTACAAAGC ACACGAGTTA TTGCCTTTAC 
TGCTCCAACA GGCCCCAGTC CACAGTCTCA CGCCGGTGGC GAGTCAAATA GTCGTTGGCT 
AGGTTAAAG7 G ATT A C AG CC CTGGAACCGA GGCCATCGCG AGTGTCGGCC ACCAAGAGAG 
GCCAGCGGAG ATGGATGCTG GGCCGTAAGC ACCAGGTCTT TCTGTGCGTT TATGAGCGGA 
GTTCTGTCAA TGGCCTTGCG CCCCCACAGG AG AAAAACG C AATGTTCTAA CTTTGAGGAT 
ATGCTACTGA TGATGAAACT CGTGAACCAA TCCCAGCCAA GTCCCTCGTG TGAGCCGGCC 
CTCCCCTTCT CCAC CGTCAA AACTGTGTTT AGTAGCAACA CACCCTG3CG AGCCCAGCTG 
TCGAGGCACC C GTG G 3 AAGG AGTACTGAAA TTGGGGACGG AAGCCTCTAG CTCTCTAAAG 
ATGCTTCTCA AACTG3GTGG AACCTGACAT TGCGGATCCA CACTAAACGC CAGGCCAGTA 
GCTTGGCCCT TGTGGTACGG GTCCTGGCCT AAGATCACCA CTTTAATATC CTCTGGATCG 
CAGCAGTGGG ACCACCACAT CAGCTTGTCC TGTGGGGGAT ACACTGTGGT GGTTAGCCTA 
AGTTCCCGAA TCT3TCTGAG CAGCGAGAGC AGTTTCTGTT T C AGAAATGA TGAGAGGCTC 
AGAAAGGAAA TCCACTTAGG TGCCAGTAAC AGATCCCGGT CGTCCACCCC CTGACTGATG 
GATAGGGTGC CCCTAAAGAC CGTCTGTTGC AACCATGCGT CCATGTTGAA CTTATTTTCC 
C „ TGA::CT gcgtgCGCTC TCCGGCTGCT GCTTTTAGCC CGAGTCTGAC TTCCGCTAAC 
AGAAC CTGT C CGGTTCATGG CCTTTCCCAC GCTTATTATA ATTATGTTTA CGTTGTGAAT 
AGAGCTATCT GCAGTGGTCG CGTTAAAACC TACAGTATAG GCCGTCAAAC TTCGTTGTAA 
ATACCACAAC AACCTCAGGT TTTCCTGCGA CGCCCAGGAC CCCAATCTTC GAACGACCGC 
GACTAAAAAT GAC CT C AG AT TAAACCCATT CACGCATGTT TCCACGGTAA TGTCGCCTGT 
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TTTGCTTCGC AG CTTGG CTA TACAGACCCC GTTGCAGTGA 77CGGATCGG CGAAGTGGAT 
AGAGTGGACC G CAAAG AACA ACGGCAGGGT AGAGGCTGCC GATG C CTG AA TTGCGCAACA 
TGGTAAGGCG ACGTATGCGT GAGATGTGAC CAATAGGGTG GTCCACAGGA CGGCAAATAG 
CGCAAAGATC CCCATGGGGC AAATCCGGGT TTCACCCTTG TGTTGCCTGG 7TCGGTGCTC 
CCCAGGGAGC CCCCTTCCGT AATATCTGTT TTATATAGTG AGGGTTCACG CATGCGCGAG 
-CCCGACTAA TGAGGACAAT TACTGAAATT GACCTTTTCG CGACACGGGC- GTGAGGTCTA 
mCCCACGA CATACTTCCG CGGAAAAATA CCCACGCTCC TTAATTTCCG TGGGAAGACG 3 5 040 
ATGGGGGAAA TGTGGCAT7A CCTGACACGG TTT C AAT CAT ACTCATCGTC GGAG CTGTCA 

INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35100 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



345 = 1 
34~4C- 

34 eo: 

3 4 66 0 
34920 
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35100 



(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
CACGTCTGGC TGAGATTTTC TAAAAAGTCA TCCAATGAAT CATCGGAATC ATCAG CACAC 
TCTAGAACTA- CTCCATATGC CGGGGTGCGC GGGGGTCCCG AGTAGTGCAC GTCGCCATCG 
G GAG A C A C AG ATGATGGGTT TGAAATGTCC ATACGGGCCG TGTGCACAAG GGTCACGTCC 
CCATCCCCAA CACAAGGACC TTTAGATACC CTCTCCCGGC ATGTGCGCGT ATCCGGGCAA 
GCAAGCTG37 GTTCTGGATT CCAAACGTGC CCAGCGGTAC CCAAAATCGC CAGGGCGTGT 
TTTATTATTT CCACAGGAAC CGGTTTCTCT AATTGCATCA CCAGGGTATC CAAAAGCCGG 
GCTTCCACGT TGATCCGGCT TACCGACAGT TCTTTCCAGG GTTTCCTGGT GGGGCGCGGC 
AGCTGACTCA AAAAGGTCAC TGCCTCTGCC CATGGGCGGG TGGGTGACAG TCCGCCATAC 
TCTTCCAGGA CACTGGCCAT GCATGACTCC AACCGTCTCA C3TCCGAGGT AATGTGCTCT 
ATGAAGATGT G GT AG AG C C A GCAGACGTTC AAAIACGATG AAATCAAGCT AAGCTCCCGC 
CGGAACTCCA CATCCACAAA GGGGTATTGC TCCGGTGTCT G T ATT AG G T C TGGAATAGAA 660 
AACT C AG AAA AAGACACTGA CCCACCAAGG AGAACCTGGC GTCTTGCAAA GTTGATGAGC "2 0 

CCCG CAGAAA GAATGTGTCT CCCGTGGGAC AAAG AG CTTG GGGGGGCAGA GATGGCGCTA 
CAGTGGGTGA TTTCTTCTAC CACGGTCATA CAT7GGTGGC ACCCACAGGC CTGTTCCAGT 
ATCAGCATAA ATCTATCTTT GCAGTCATCC CAGATCAAAG T CAT G T C AG A TG CTGTTGCC 
TGGCATTTTG CCCGCATGTA CATTTCCTGT CCCACATATT TTAACATCTG TAATACTGGA 96 0 

AG T AG ATT C A GTCTGGTGTT GAGCCCCCCC GGGGAAGCCA GCGTATGCTT CAGGACCACC 
AG3GACGCTA AGAACCCCGG GTGTCCGCGC TCCGGAAACA GACCTCTGAG AATACGCTCG 
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G7CTTGACGA AACCCGATGT GGTACCGAAT GCCACAATCT GTGCCCTCCA GCTC7CAGAA 
7TTTCA7CTC CAATACCCGG AATTGGGATA CACACCTCCA TGTTCAG7CA CATGTACGCT 
AGSGTCTCCC CACCCAACCC C C AT AGG AC C CAGCTACAGC TTATCCTCCA CTAAATA7CA 
GGCAGCTACC GGCGACTCAT TAAGCCCCGC CCAGAAACCA GTAGCTGGGT GGCAATGACA 
CGTCCCCTTT AAAAAGTCAA CCTTACTCCG CAAGGGGTAG TCTGTTGTGA GAATACTGT7 
CAGGCAGCCA CAAAAATGGC GCAAGATGAC AAGGTAAAGA TCG AC CTTT7 TATTGTA7AC 
TGAACAA7GC GTGTTTACAA TGGTGTAGGT G GG AG C AG AG TTCGCCAAGC TCTACGTCCG 
AACAGTCGGG 7GTCAGGGC7 CT7ATTAAGT GTTCGGTGTA CTTGACCAAA GCCGCGGAAC 
CTAGGT7GGG 7C7G7ACAGG TCGTACCAGG CAAAAAAGGA TCGGGCGGTG CTTTTCAGGA 
GAGTTAGGGA CGTGCTGATT ATGTGGACAA GCTTCTGCTC GTAAATG CAC CGCTGGTACA 
7CTGAACGAC AGCTGTCCAA AAAAAACAAA GGTTCAGCTG CACGTTAAAA TCTGTATCCT 
GAAAGTCrrG GTAAATGACA GTTTCTACCA AGAAAAACTT TTTTACCACG ctggccatci 
ACTGAAAGGA GGGAGCACAC G7CCCGTTGT GCGTTGTTAG GAT AT C C CT A ACTTCGGAGC 
GGAGACGGC7 GGACGCTCCC ACAAAATGGG AGAGGCACCA CTCTGTGCAG TCCGCGGTC7 
GGGGTTCTGA TTCCAGGGGC GCCGTGTGGG GGTATTGGAG AGTCAAAACT C7GGGCAG7C 
C77TAATGAG CTCTCTCTCA AAACCTATGC AGCCAGCGTC CACTAGTGGC AG7A7GCCG7 
7AA7AACAC2 CC77A7C77G TCGTTGCCAA GTTTGTACAA CTGCTGCAGG GAATAAGCCA 
AATTCG CC 77 AGCCGCGGGA ACCAGGTACG GCTCGCTTTG TCGGTGCTGG ACCAATATCT 
GAA7GG7777 7GCAAGG7A7 AGGG7C77C7 CAACGTTTAG AGCGGGTACG TG3CAGTCTG 
GATTGAGGGT GGCGA7GGAC AGGGTATCTA ACTCCTGAAG TAT7TGATCC CAGGACGGGT 
AATGATAC27 AAA7AGA7GG TTGAACAGGT GAT7TTTAAG GGGCC7TC7C GATG7CAT7G 
7AAAAA77A7 GACACGCCAC 7C7C7CCT7A GGG7AAGAAG C7TCGGCGG7 C77G7GTGGA 
AAGCTTCGTC GGC2777CGG ACGAACTGAA GGCCCAACTC TACCAGTGTG 7GCTCC7TAT 
AAATGACGCA TACGAAACAA 7CTACGATCC CAGTGACC7A AA7AGAGTGG 7GGAAGA7G7 
GTGCATTCGG A7TA7GAAAG AATGTTCCAA GCTTGGTGCG CTATGTGG7C TGTTTACAGA 
CATTAACATG TTTAAC7TT7 7C7GC77TT7 7CGTGCC7C7 CGAATGAGGA C2AAAGGCGC 
GGCCGGG7A7 AACG7GCCAT G7G2AGAGGC ATCCCAAGGC A7TA77CGGA TCCTCACGGA 
GAGGATC77A TTCTGCACAG AAAAGGCA7T TCTGACAGCC G C ATG C AG Z G GGGTGAGCCT 
GCCTCCAG7C ATA7G7AAGC 7AC7ACACGA AATATACACT GAAATGAAGG CCAAATGCCT 
GGGGGCCTGS AGGCGAC7CG 7CTGCAATCG GAG3CCCATT AT GAT AT 7 AA CCTC77CCC7 
AC7GAAGC7C TACAACACGT ACGATACCGC CGGGC7GCTC 7CTGAGCAG7 C2AGGG7CC7 
C7GCC7777G G77772CAAC CGGTCTACCT TCCGAGGAT7 ATGGCGCCGC 7 G G AG AT CAT 
GACCAAGGGT CAGCTCGCCC CTGAAAACT7 TTA7AGCATC ACCGGTTCTG C7GAGAAACG 
CCGGZCAAT7 ACCACCGGCA AGGTCACTGG ACTGTCCTA7 C2AGGAAG7G G7CTCA7GCC 
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AGAATCTTTA ATTTTGCCAA TCCTGGAGCC AGGACTGTTG CCGGCTTCCA TGGTAGACCT 
CAGCGATGTG CTGGCAAAAC CCGCCGTTAT TCTGAGCGCC CCTGCCCTGA GCCAGTTTGT 
C ATT AG C AAA CCCCATCCCA ACATGCCGCA CACCGTCAGC ATCATCCCC7 TTAACCCATC 3 3 0: 

GGGTACAGAC CCGGCGTTTA TTAGTACGTG GCAGGCCGCG TCACAGAATA TGGTGTACAA 3 3 5C 

CACATCC , CC gcgccCTTAA AACCGGCCAC CGGTAGTTCA CAGACGGTGT CAGTCAAGGC 
GGTTGCTCAA GGGGCCGTGA TTACTGCGAC AACGGTGCCG CAGGCAATGC CAGCGCGGGG 
TACCGGAGGG GAGTTGCCTG TAATGTCAGC GTCCACTCCT GCAAGAGATC AGGTCGCTGC 
ATGTTT7GTC GCAGAGAACA CCGGAGATTC TCCCGACAAC CCGAGCTCTT TCCTGACGTC 360C 
ATGTCACCCT TGCGATCCGA ACACGGTTAT AGTGGCCCAG CAATTTCAAC CACCGCAATG 36 6 0 

CGTTACGTTG TTGCAGGTTA CCTGTGCCCC CTCTTCGACA CCACCCCCCG ATTCAACAGT 
CCGGGCCCCG GTGGTGCAGT TGCCAACAGT AGTCCCTCTG CCGGCCAGCG CGTTCCTCCC 
GGCGCTCGCC CAACCAGAAG CCTCGGGCGA AGAGCTTCCG GGCGGTCATG ACGGAGACCA 
AGGTGTGCCG TG T AG AG ATT CAACGGCGGC GGCTACGGCG GCAGAGGCGA CAACACCCAA 
ACGAAAGCAG AGAAGCAAAG AGAGGAGCTC AAAGAAGCGT AAGGCTTTGA CCGTGCCAGA 
AGCCGACACC ACGCCATCGA CCACGACACC TG GT AC CT CT TTGGGATCAA TTACCACCCC 
CCAGGATGTG CACGCCACGG ATGTCGCCAC GTCTGAGGGA CCATCGGAGG CACAACCCCC 
GCTACTGTCG TTACCCCCGC CACTGGACGT AG AT C AG AG T CTATTCGCCC TGTTAGACGA 
AGCGGGCCCT GAAAGATGGG ATGTCGGGTC GCCTCTCTCC CCCACTGACG ACGCGCTGTT 
GTCCAGTAT7 CTGCAAGGAC TGTACCAGCT G3ACACGCCA CCGCCTCT3C GGTCACCGTC 
CCCCGCTTGC TTCGGCCCGG AGTCTCCGGC GGATATACCG TCACC7TCTG GTGGAGAGTA 
TACGCAACTG GAACGGGTCA GGGCGACCTC G3TGACGCCC GCTAACGAGG 7ACAGGAGTC 
CGGCACAC7G TACCAGCTGC ■ ACCAATGGCG TAATTACTTC CGAGACTGAA GTGTTCGCAA 
GGGCGTGTGT GCCTGCGTTA ACTTGCCAGG CAGTT7A777 7TAACAGTT7 GGTG GAAAGT 4500 
GGAGTTAACC TAG AG ATT CT A CTT AAAAT A GCTCATTTTC TCACGAATCT GGTTSAT7G7 4560 
GACTATTTGT GAAACAATAA TGATTAAAGG GGGTGGTATT TCCTCCGTTG TCGACTATAA 
CCTGGCGTGT AAACGTGTAA CCCTGCCAAA TGCCCAGAAT GAAGGACATA C7TACTAAGA 
GTTCCCCGGG AACGGACAAT TCTGAGAAAG ATGAAGCTGT CAT7GAGGAA GATCTAA3CC 
TCAACGGGCA A C C ATTTTTT ACGGACAATA CTGACGGTGG GGAAAACGAA GTCT7TTGGA 4 SCO 

CAAGCTCGCT GTTGTCAACC TACGTAGGTT GCCAGCCCCC GGCCATACCG GTCTGTGAAA 4 66 0 

CGGTCATTGA CCTTACAGCG CCTTCCCAAA GTGG-GCGCC CGGTGACGAA GATCTGGCAT 4 92C 

GGTGACTGAA TGCAGAAACT AAATTCCACA TCGCCGATCC TTCCTGGACG CTCTGTGACA 4 98 0 

CACCACCAAG AGGACCACAC ATTTCGCAAC AGCTTCCAAC TCGCAGATCC AAGAGGCGAC 504 Z 

TACATAGAAA GTTTGAAGAG GAACGCTTAT GCACTAAGGC CAAACAGGGC G~AGGTCGCC 5100 
CCG7GCGTGC GTCTGTAGTT AAGGTAGGGA ACATCACCCC CCATTATGGG GAAGAAC7GA 516 Z 
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CAAGGGGTGA CGCCGTCCCA GCCGCCCCTA TAACACCCCC CTCCCCGCGC GTTCAACGCC 
CAGCACAGCC CACACATGTC CTGTTTT CT C CTGTTTTTGT CTC7TTAAAG GCCGAAGTAT 
GTGATCAGTC ACATTCTCCC ACGCGAAAGC AAGGCAGATA CGGCCGCGTG TCATCGAAAG 
CATACACAAG ACAGCTGCAG C AG G TAT AG A CGGGAAACAG GTGTCTATCT TGGCCGGCTG 
GTTACTCAAA TGGGAACAAT GGCGCCACCT TGCTGTCTTT GTAGG "ATT A GAAGAAAAGG 
ATGCACAACT ATGTTTCCTA GCGGCGAGAT TGGAGGCACA TAAGGAACAG ATT ATTT T C C 
TTCGCGACAT GCTGATGCGA ATGTGCCAGC AGCCAGCGTC GCCAACGGAC GCGCCACTCC 
CACCATGTTG AAG CTTGGTT GTGCCGTCGT CCGGGAGAAC CATGCCAGAC TTTGTGTGGT 
AAGAAGGAAT TGTTATCCGG CAGCAATATT AAAGGGACCC AAGTTAATCC CTTAATCCTC 
TGGGATTAAT AACCATGAGT TCCACACAGA TTCGCACAGA AATCCCTGTG GCGCTCCTAA 
TCCTATGCCT TTGTCTGGTG GCGTGCCATG CCAA7TGTCC CACGTATCGT TCGCATTTGG 
GATTCTGGCA AGAGGGTTGG AGTGGACAGG TTTATCAGGA CTGGCTAGGC AGGATGAACT 
GTTCCTACGA GAATATGACG GCCCTAGAGG CCGTCTCCCT AAACGGGACC AG ACT AG GAG 
" CTGGATCTCC GTCGAGTGAG TATCCAAATG TCTCCGTATC TGTTGAAGAT ACGTCTGCCT 
CTGGGTCTGG AGAAGATGCA ATAGATGAAT CGGGGTCGGG GGAGGAAGAG CGTCCCGTGA 
CCTCCCACGT GACTTTT ATG ACACAAAGCG TCCAGGCCAC CACAGAACTG AC CG ATG C CT 
TAATATCAGC CTTTTCAGGT GTATTACACG TTTCAACTGT AATCCCTCGC AATTG G G T AA 
ACCGTCGGTG TGTAGGGATA AAGCGTAACC TTACGTTCTG TCTCATCTAC AG GAT CAT AT 6240 
TCATCTGGGG AACCATCCAG GACCACGCGA ATTCGCGTAT CACCGGTCGC AGAAAACGGC 630C 
AGAAATAGTG GTGCTAGTAA CCGTGTGCCA TTTTCTGCCA CCACTACAAC GACTAGAGGA 636 0 

AGAGACGCGC ACTACAATGC AGAAATACGG ACCCATCTTT A C ATA CT AT G GGCTGTGGGT 
TTATTGCTGG GACTTGTCCT- TATACTTTAC CTGTGCGTTC CACGATGCCG GCGTAAGAAA 
CCCTACATAG TGTAACACAA AACCATAAAA GTAAATAAAC GTGTTTATTG TTCACATGAT 
AAA GAG T G G T ACTCTTTACT GGTTTGGGGG TTGGGTTGTG GCGTGGTGGC TGGTCCGCGG 
TTCAGTCATC AACCCCCGCC CGTGTTGTCG AGGCTCCTCT TCGTCGCCTG TTATTGGCAC 666 0 

CAGGAGGCGG TTTAGCGGTG CCCCCGTCTG AC ATG CAGAC GTCGATTCTA AGCGAAAGTC 672C 
CCTTCAGGGC ATCGTCCACT TGCTTTTGTG TTACAACCTT GCT3AATATT GTCCTGACCC 
TGGCTTCGAT TTTCTTAGCG GCCGCCGCAC TCAGTGCACC CACAGTAGCG GTAAGCTGCC- 
C - TCZTTZTZ GGTGGCCGTC AGAGGCCGAT CTCTCGGATC GGCAGTGGAT CCCAGTGCTT 
TCCGAAGCTC CCGATTCTCC ACAGTCAATT GGCTTATCTT TGCGGTTAGG TCTTCCATCG 
TAAGGTCCTT TTTGGGTCTG CCCCTGGGCG CGGCCATGTC AGGTACGCGT AGATGTACGT 
GTTGGTGATG CTCACAACAA AAGCCCAAAT CCCTCCTTTA TACCCAGCTT TAAATACTTT 
ATTGAAAAAC CATAGCTTTC GTCAGCGCTT GTGCGAGTAA T CAT ATG CCA 3TCTATGCAT 
GGACCACCTC GTCCACAAAC TTGAAAAAAC AAAGATATAC CAGATAGAAA AATGTGGCCA 
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CGACGACTAG TAACGCGTTA ATCAAGGCCC AGACGCTAGA AAAGCTAGAA AG GGAGGGG Z 
TAAAACTATC CGCGGAACAA GCAACGTCAT AGAATCCTGG GGTAGTGACT GATGTGGGAC 
CGGGCGAAGG CCTGGCGCTG AGCCCAGCCG T ACTGG G ACT AGAACGCTCT GTAGATGATG 
CGACACCTG7 CGAGTTGGCC GTAACCCAGC AGTG AC C TAG TATCGAGGCC AC AAATAAA 3 
CCAGGGCCAC CGTGGACGCT GTCATTATGA ACAACCGCCG AGGCTCCAAG CCGTCTATCC 
AACGTTCCGC GTTCGCCTCT TATATACACT CTGCAATGCA GTCCGACTCT GCCCCTCTAC 
CCAGGGTGGA ATATGTGTTC GAAACAAGCA AATTTAGAAT GACGTCGAGA GCAAATGAAG 
CCAGACTCAG ACTGACAAAT GAGTGTCCGA TACTGGTGAG ACCCCACGAG CCGTTCATCA 
TGCCCACCGG AATACACTTC ACGCGAACCC CTAGCTGCGC TTTCATCCTG ACCGGAGAGA 
CCGACAAGGA TGTATTTTGC CACACGGGCC TAATCGACGG AGGCTACCGC GGGGAGATAC 
AGGTTATTT7 ACTCAACAAG AGGAAGTACC CTGTGACGCT GTATCGCGGG GAGCTCAACA 
TCTGCCTGTC TGCTTTCAAT TACGTGCTAC CTCCGTTGAG GGACGTATCA TTCTTAACCC 
CCCCTATGTA TGCAAACGAC GCCGGATTTG ACGTGATGGT GATGCACTCT ATGGTTATCC
CTCCTACTAC TGACCAACCG TTCATGATAT ATCTAGGAGT GGAGACCCCA GGCCCCCC7G
AACCCCAC37 GGC7C7AGCA TTGGGGCGAT CCGGTCTAGC ATCTAGGGGT ATAGTTATAG
ACGTTAG7GA GTGGGGACCG CGAGGATTGC AGCTGAAGTT TTATAACTAC TCGGGGCAGC
CGTGGCTGGC GCAGCCCGGT AGCCGCATAT GCCAGATTGT GTTTGTGGAA CGCAGACACA
TCCTCAAGGG CTTCAAAAAG TGCTTGCGCC ATAGGAAGCT AGCTCCTGGC GTCCGTTTCC 
GGGAGGCTCG AGTG C ATTTT CGCGAGGATA CAAATAGCGT CCGAAAACA7 ACCCACGAAG 
ACAACCCC37 CCACGAACCC AACGTAGCCA CCGCTTCCGC TGACATTCGT GGAACCAAGG 
GGCTGGG3TC GTCTGGGTTT TAGAGCCGCC GCCAAATGCG GCCAGT7TA7 7AGGGCGA77 
CGATCCCGCA ACCCACAGCA TCCCCCAAAT AAAAAAACGA GTGTACACAG CCAATGTTT7 
7ATTA7TC77 CGA7TCAT7A C7G3TACCAG AGAATAAAGC CAACCTATGT CGAACC7A7C 
GCGCTTTCTG TCG7C7CTTC CAGGG7TGAC GAAGGCCGGG GAGGGATTGA C 3 AATG CAT C 
GCGGAAACGG ACGGGTCTTC GGTGGGTGGC TTGGGTAAAG TTGCCTCCGG C7GGCGCGTA 
ACGGCAGGCG TGAGAGGCAA TACAGAAGTG G3T7CCGACA AGGAGTGGCT GATCTCAGAG 
GCCCATA77A CCGAGTCG7C TG AC G C CAT A GCAGTCGCCA GTTTTTCCAT CTCCAT3AGC 
GAAACGCA77 CCCCGGCCCT TT7G77TAAG A3GGACTGGA GCGCACTG7C G7CCACGGTA 8880
ATCTCGCCGA CCGCCAAGGC CAGCATTGTG T7CCACACGA CGTTCTGAAT AGACTGCAG7 8940

7TTTTCACC7 GGGTTTTCAC G3TCTCCTGG CAGCCCGCCG GAATTTTAGC CACGTCAAAA 



CGC7TCAGGT AGTCTGTGAT CTTGTTTGAC TGTACAGCCA GAAGG7AGGT 37GGTGCAGC 

GCCGTCGTGC CAAGGTTCGA C7G3ACAACG TCACCCAGAC ACACTCCGGG GG3GAGGCCC 

AAATCTATC7 C7TGCCGCCA GCG CTCTGGA CAGCCTTCCA GAGGGTCAC3 GAGGCGCTTG 

TAAGCGTGG7 TGCCGCGTCC AAAAAGGTT7 A7ACCGCAAC ACGTCCAGGT 37ACCATGGA 
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GACGACATAC CGCCGCGAGG CGCTGACAGT AAGGG7TATT TT7TGTACGA GTGGCGACAG 
CGCCGAGACG ATCGCCGACG TCCTTACGGG GGCCCCAACG TCAGCGTCCT TC77TTCTG7 9360
ACTCCACGAC CTTTTTTATT CCCAGATACT CGCCCCCAGG GTAACCCTAA AA7TGTGCCT 9420
CCCCGCACGG CGTCCTGGCA ACGGCACAAG GTGTTCGCCC GTGTTGGTCC TACGTACTGA 9480
CGCATCAGTG GCCTCGGGGT 7CCTTGGCGG CCGGCCACTG GAGGCGTCCG ACATTAAATA 9540
TATGCTGCTC AGCGACCAGA CCGCGGGGTT GTTCAAGCCG CTGTTGGAGA TAATCGGTGG 9600
CGCGCGCGCA CCACCAAATC AGGACGCGTG CACTTTCCAG AGCCAGGTGG CCTGGCTCAG 9660
AACGAAATTT GTTACCGCAT TGAGAAAACT TTACAAGATG ACTCCCTCAC CCTACTGGAT 9720
GCTGTCTGCA TTTGGCGCTC AGGAAGCCCA GTTCGTCCTG ACCAGCTCAT TCTATTTTTT 9780
TGAACACACT GTGGTCTG7A CCAGAGAGAC AG7TTCTCAC CTGTCTAGAC TGTTTTCGCC 9840
TCAACAGGGA CAGACGCTGG TTTCCGTTAC CAGCCACGAG GAGCTGGGGC AGCTATACGG 9900
CACTTCCCCT TTCAGGCGGC GCGTCCCCGC G77CGTCGCT TATGTAAAAG AGAAATTAGC 9960
GAGAGACAGT CTGGAGACGG AGGCCATCGA CCGCACCATA GACCAGATCA GGGGCAAACT 10020
CATGCTGTCT AACCAGGACC TGGTCCATTT CATATATATC TCCTTTTATC AGTGCCTCAA 10080
CAAACGGGCG TTCCTGCGCT ACTCTAGACA GACGTCCTCT TCAAGTGCTC TAAGGGAGCT 10140
GGGGGAAGAC CCTCAATTGT GTGGCGCCCT ACACGGGGAG TTTCGTGACC ACGTCCAGTC 10200
CTACTACCAC AAAAAAACCT ACCTATCCAC TTACATAGAC ATTCGGTACG TGGGTGGCGT 10260
ATTACCAGAC GGCTATTTTG GCGGGAGTCT TGTAGGCGAG CGG7GCG77T A7TGG7GCGG 10320
GCAGTCAAAG GACACGGCCA GCCTGTTGGC CACCATTAGC CAACAGGTGC CGCACCTGAG 10380
GTTGCAAAAC GAGTTCGCTG GCATGCTAGA CGTGGCCGCA CTGCGAGGTT CCGATGACGG 10440
TCAGTTTAAA GAGGGCCTTT TCTCCCACAG TCAAGCCCTA CCCCTGTACA GGTGCGAGTT 10500
TCTGGGCAAG CAGTTTTTCA CAATGCTTCA GGAAGACGGC CTAGAGCGAT ACTGGGAGCA 10560
AAGTGTGATA TTTCCAGGCG ACCAGGACTG GGATATGT7A TCTGACAAAG ACCTCACCTA 10620
CCGAATTTTT TACCATGACC TCAGCCTATC GCTGCCAACA C7GAAGGAAC AGCTCCTTGT 10680
T7CAAGACAC GAATACTTCA ACCCTCGC7T GCCAG7GTA7 AGATGGG7A7 7AGAC77TGA 10740
CCTGCCCG7C TGCCGCGACA 77GACAGGAC ATTCGAGGAG GTGCAC7C7C 7CTG77G77C 10800
CCTGCGTGAG GCCA7AC7CG ACATCATTCA ACTCCTTGGA CCAGTGGATC CTCGAACACA 10860
CCCAGTATAT TTTTTCAAAT CAGCCTGTCC ACCGGACGAG TGGCGCGGCG AAGACGTCGC 10920
CAGCACCAGC TTCTGTCGGT GTCATGACAA ACTGGGTATG CGTATTATCG TCCCGTTCCC 10980
AGAAGGAGTA TGCGTCGTTG GGTCGGAGCC CATGGTGGCA CTCACTGGCA TTCTAAACAG 11040
GACGATAAAG CTTGATCCGG AGCTGGTCCA CAGATTCCCG TCAATACAAA AAAAGGGGGG 11100
CCCTTTCGAC TGTGGCATAT ACGGCCGAGG ACGAAGCGTC CGGCTTCCCC AC7G7TACAA 11160
GGTGGGCTTA GTGGGGGAAC TCTGCCGCCT ACTGAAGATA CTAGTCTGTC ACCCCGCCCC 11220
CAACGGCAAG GCGCA3TACG TGCGGCGCGC CTTTACGCTT CGCGAAC7GC 7CCA7CACTC 11280
CCCGGGCCAC AGCGCCGGTC ATGTCGGCCG AATCATCTAT AGCATCATGG ATCGCAATGA
GAAT7T7TTA GAAAACAAGA CCATTAGCTA TCTGCCGGCC AAAATACCTC ACATCTTTCA 11400
GCGGATAGAG ACCCTATCCG GTCGTTCAAT AGAGGACTGG CTACACTCGG CCGTTTGGGA
TAAAGCATAC GACACTATAT GTAAATTTTT CCCAGATGAA AAAGCACAAC AGT777C7CA 
CGTTGCATTT ACGCAACAAG GGGAAAACAT CATCCAGTTA AGACCCCGTC AGGGAAGACA 
CTTCCTCTGC ATCAACCATA ATCATAAAAA CAAGTCAAAA ACAGTCCGTG TATTCCTTAC
CCTTCATTCC ATTAGGGTGA GCGAAGTCAC GGTAACACTT ATGAGTCAGT G77TTGCCAC
CAAGTGTAAC AATAATGTTC CCACGGCCCA TTTTTCGTTT GTGGTACCAG TGGGACTGGC
CAGTTAATCC CACTATATAA CCTGGCTGCC AGGTTCCCAA AATAGCCCGC GGCATACGGC
TCACTTCCCC CCACATTCCC CCCGTGCACA ATATAAGAAC CAAAGGACAT GGTACAAGCA
ATGATAGACA TGGACATTAT GAAGGGCATC CTAGAGGGTA AGTCCTCGTC TACAACAGAC
TTTTCCCATT TCTAACGTAT CGTGCTATCT TCGTCGCCCG GCGGACCATC CZZCZACCCC 
TCA77TA7C3 CGTTTGATAT TACAG ACT CT GTGTCCTCCT CTGAGTTTGA CGAATCGAGG 
GACGA3GAGA CGGACGCACC GACACTGGAA GACGAGCAAT TGTCCGAACC CGCCGAGCC7 
CCGGCAGACG AGCGCATCCG TGGTACCCAG TCGGCCCAGG GAATCCCACC CCCCCTGGGC 
cg _ tcc , ;Ja ;^ lAAAT C7CA AGGTCGTTCT CAACTGCGCA GTGAGATCCA G7777GCTCC 
CCACTGTCTC GACCCAGGTC CCCCTCACCA GTAAACAGGT AC G G T AAAAA AA7CAAGT77 
GGAACCGCCG G7CAAAACAC ACGTCCTCCC CC7GAAAAGC GTCCTCGGCG CAGACCACGC 
GACCG2C7AC AATACGGCAG AACAACACGG GGCGGACAGT G7CGCGCTG2 A22GAAGCGA 
GCGACCCGCC GTCCG3AGGT CAATTGCCAG CGG2AGGATG ACGACGTCAG ACAGGGTGTG 
TCTGACGCC3 TAAAGAAACT CAGACTCCCT GCGAGCATGA TAATTGACGG T3AGAGCCC2 
CGCTTCGACG ACTCGATCAT CCCCCGCCAC CATGGCGCAT GTTTCAATGT C77CATTC2C 12600
GCCCCACCAT CCCACGTCCC GGAGG7G777 ACGGACAGGG ATA7CACCGZ 7C7CATAAGA 12660
GCAGGGGGCA AAGACGACGA ACTCATAAAC AAAAAAATCA GCGCAAAAAA GATTGACCAC 
CTCCACAGAC AG ATG CT G T C 777TGTGACC AG CCGCC AT A A7CAAGCG7A 27GGGTGAG7 
TGCCGTCGAG AAACCGCAGC CGCCGGAGGC C7GCAAACGC 77GGGGCT77 C37GGAGGAA 
CAAATGACGT GGGCCCAGAC GGT7G7GCGC CACGGGGGGT GGTTTGAT3A GAA33ACA7A 
GATATAA777 TGGACACCGC AATATTTGTC TGCAATGCGT 7TG77ACCAG A.777AGAT7A 
C^TCA-C777 CCTGCGT7T7 7GACAAGCAG AGCGAGC7AG CACTGATCAA ACAGG7GGCA 
7A77TGG7AG CGA7GGGAAA CCGC77AGTA GAGGCATGTA ACC77C77GG CGAGGTCAAG 13080 
CTTAACTT=A GGGGAGGGC7 GCTCTTGGCC 7TTGTCCTAA CTATCCCAGG CATG3AGAGT 13140
CGCAGAAG7A TTTC73CGCG CGGACAGGAG C7G7TTAGAA CA27T37GGA ATAC7ACAGG 13200
CCAG3GGA7G TGATGGGGC7 ACTAAACGTG ATAGTAATGG AACATCACAG C77G7GCAGA 13260
AACA3TGAAT G7GCAGCGGC AACCCGGGCC GCAA7GGGGT CGGCCAAA77 7AA2AAGGG7 13320
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T7A7TCT777 ATCCACTT7C 7TAAGGATTG CCAAACCCCA TGGCAGAG7G TC7CCCG7A7 13 3S: 
TCCATGTAAC TCACGTAGCC T77C7CTAAT AAACAAGCTA CCTGCAAACT ATACACAAA7 13440
GAAATGAGTC AGGCGTGGTC TC7TCTCTAC CGTGAATCGC AC CTTAAAC A CAACACCAGA 
CCGCCACCAG GTGGCACCCA ACATCCATTA TGGAAAAACC CCGCGCCACC TTCCGCCACG 
TGGAGCCAAC AAACAAGACA CACCCGCCAA TGTT7TGG7C 7C7T7A7TGA TATGATATAC 
TCCCTCCCAT AACAATACGG TGTAGGCATT TTGTATTATT TATTGCATGG CATCCCATAA 
CGGCTTCGGC ATTAT7TCGA G7ACGACGCA GGCG7C7GAG AAAT7AC7GC ACCTCGCCGC 
AAAG7CTCGC GGGGACGGGG CG7GGGGCTC TAACTTGCCA ACCGCCACCG GTTTCCCCAG 
CCACAGC77C ACCAAAGGAC ACGTCACGTG AGAGGGTGCT GG7AACGGTG AA77TG C CAA 
CCCCACCAGA AATG7A7TCG GG7TAAATAT CCTCGTCGG7 TTTCCCTGGG GCAGCAAGAG 
GGGGCCGGAG 7CAGGCGGAA CGG7A7TTCC AATAAAGTGC ACGGGCCCGT TA7GA7AACA 
7ACGCAAAA7 ATGCCAT7AC AAGAGCTAGT CAGCAGAA7G CCTTTTGCAC A7GCG7CCAG 
CG7ATCGCAT AGCTCCCGCT TGGCTA7CTC GCAGGCCAGG TTTGGCACAT 7GGGTAGCCA 
TACCTGGCCC GGAGACCCCA CTGCACAG7A ATGAACTGCG GGG7CCCTAC GCAAGGCCGA 
7GAGA7TCGA CAGCCCGACT GGCTTGTCGT CAGTAAC7CA TGAACCTG77 CGCCA77A7A 
ATACATCC7G ATAAACAACC GACCCCAGTC AATGACGGCC TCCTGACCCT CTGCCGTCGT 
ACAAGA7GGC ACGGGCG7TA CAATCTCGCC TGGCAAGCAC TGCCCCGGGG AAAAAAA7CC 
CTCTTGCAAG AGACGTGCCA TAT 7 G 77 AAA ATCGTGGACG GC7CCGGCCA CGACTCCACA 
7TCCACGCAT TGT7C77CC7 CCG377TACG TACT CTAAAG ACCAGAAAA7 GGTG7CCA7C 
C7GAGAAA7G CCTTTGCCAA 7CTC7TGTAA ACCCCGCG7C CTGCGTAGCG CGGCAAGCA7 
TCGCC7GCGC CCCC7GGTGC C777AAACGA GGCGTCCACG GGCATGT7AC CCCT7TCGCG 
GATATACACA ACACCCAA7T CCCCGTCTCT GCGCCATTCA AAACAGGGG7 CCGCGAGGGG 
CG7AACTGGT A7ACGGAAGC GGGTGCGCTC 7TCGTC7TCC CACTC7AC7C CGGGAAATTT 
TCCACTG7TG ACTTGACATA CTATCCAATC C7TGA7TGAC GCTTTCCCCT CACTGGCACC 14760
GGTAGA7A77 C7TAG7TGTC GTGTCCGGC7 CCAC7CCG7T A7CGCAGCCA CCACAGCCTG 14820
CCGTGTAA7A TCGCCTGCGG CTGCAGAACC CCCGG7CCCG GAGGG7CC77 CTCCCGG7GA 
CTCCGACCTG GATGG7TCAT CGCAAGGAGC CCCGGAGCCA GATGTTC CCG GTGACCC77G 
TGACAAACAA GG77777TGG GTATCGCCCC AGGCGCCCCA AAAGGGT7CG G7C7T7GGCC 
TGGGTCCATT GTCCCGCAAC C AG ACT AG CT CGCGCCGCAA TG7CCAG7GG TAAGCACAGC 
TATGCCGGGG AGCCACCGGC CAT C AG AT A7 AGAGAGGCGA CAGGCTC7CT A7A7ATCACG 
GCTAGGTGGC TGACATATTA GTGGGCCTAG CCGCAGAATT GCC7GGGTAG 7CAAAAACCA 
GCGT77C7CA AATTAACCGA AACTACATTT 7TC7A7T7TA AG7ACGGGA7 ACAAAGCAGG 
GTCTGAGG CA ATC7GCCGCC CTCCACCCCC ACCCACCA7A CCCAAAAAAG ATA7GTCAGA 
AAGA3~ATC TACCTAT7AA CT C G 7G GAGA AACATCATAC AAAATCTGTA CATTAT7TTT 
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AATAC777AA TT7G7GCAGG TTTCTTCACC CCACACC7GC 77777G7C7G G7ACAAAAAA 
CCACTGCAGG GTCCCGCC7A TAGCCAACTC CTAAGCGGGT 7T777GC7AA AG CAC77777 
TAGAC7G7CC C AG AAA CC AC ATAGCTTCCT TTTCACTCAT 77GAAAAACA GCCCCGCCGA 
AC7GCC7GGA GAATTTTCCA CCCCCTCTAC CATTTCGC5C C777ACCGC7 GGTGCGAAAT 
C7AGCCA7CC 7A7CACCGCG GATCCGCTGG AC CAATATAC CACGCCCACT 777CG7AA7C 
AGCAACCCTC TACGCCTACA CCCCTATGAC TGAATATAAC CCCCAACAAG GC7ATGAAA7 
CATGAATGGT AAC7G7C7GG ACACCAATC7 7CCGCGGGG7 GGCGGCAGTG CGACGCAAG7
ATCCACAA7A AA7GG7GCAA 7AAT7GGCGA AATGTCG7G7 C7GG777A77 7GGAC7ACAA 15840
GA77ACA7CC GG7777A7AA 77CACA7A7A 7GA7CAATG7 AGACTATCCC AAA7GGAGC7 
7A7AAAAA77 77AACAG7CA AGGG7ACAT7 7TGGAAATT7 TC7GTAGATG CCGGGGATGC 
GCCGAAAAA7 ACCG7CCCGC ACG7CACTGG GTTGACGCTC AGCGGTG7C7 G7GGGAT7GC 
GGC7G7GG77 GCCAGG7A7C GCGCGG7GT7 GAACAGC7GC TGCGGAACTC 7GGGGC7AAA 
GC77CGGAGG A7GCG77CA7 AGCGGGAAT7 7GGAT7ACCA AACCACCAG7 C77CCAC77G 
AG7GGCG77T C7GGAG7A7A T7CCAGACA7 CGAGCAAAA7 A77GGGAA7C CG7GGCCAAG 
GCC77CAAAA AC7CGG77CA AAA7C7CCA7 7TGC7CGGG7 GAGGGGAC7G 7AAGACGCGG 
TA7GCGAAGC AG77C7GG7A CGAAAC7C7G ACA7AGGTGC CCCAACG7A7 CCCCAACAGG 
CCAGC7ACAT AACA77GCCT CGCCCGCG7C ACC77CGCG7 C7CAGAGT7C CACGAAGG7T 
CCCA7ACACA AAGA77TCCA CAACAAAAGA CACCCGC7GA C7A7CAGGGG GA7CAAAAAA 
CA7C77TGAA GGTGGCTTTT CGGGACCGGA G7GGCTAACG GGCGTACGCC GCCCGTGCGG 
GGACCTGGAC C7CGGGCGCC GCCTATCCGT GGCCTGTCTG G77GAGGAG7 7CGG77CC7G 
C7GCAGC7CA GACAAAA7G7 TACCCAACCC T7C77CCCAC G7ACA7A7AT C77C7CC77G 
AAGG77ZGAG AGCG7AAGAG GGAGACCCAA AGGCGGCGGC ACTAAAGA77 G7TCTGGTCC 16680
A7AACCCCCC ACTGCA7A7C 7A7CTCCAGC A7A7G7AC7A ACAAG7GGAA C7GTGGGCC7 16740 
T7CGCCAC7A CCCGGGCACA CACACTCCCG CCGC7CCAGC TG7G7CGG7A AATG CGAAAC 
C7CGGGG77C ACAGCGGGC7 CCGG7GCAGA A7AAAGCACC G7AGG77GGA AAACGCGCGG 
CCCAC7GAGA GG7AGGGGCG TGGA7GC7AC AG7GGTAGA7 GGGG7A7GGG AATCCCCAGT 
GAGG7CAA7A A7C7CCACTT CGAGGGCACC AGAAC7AG77 G7CACGCG7C 7G7A7CCAG7 
CGCCATG77G TCCCCCTGGC AGACG7ACGG TATTCCAGAC GAGGATGGC7 Cr7G7CG77C 
TGCCACC7C7 GGGGTGGG7G G7GCGCCGGC GGAGGGCG7G GCGGACGCG2 CA7GC7GCG7 
G7GGGAAAGA CCC7GG7T7G GAGCGCCTCC AC7AGACCAC G G AA7 C C AAA GCGGTG7GCG 
AACTTCCGGC ACCACGGCGT GACCAAC7GG TGGG7GGGAA ACAGGCGCG7 G7ATGGG7CG 
CG7AGCTGGC GG77C7GCCA ATGGAC7CCA A7TGTAACA7 GA7GG7TTCG CA7ACCCGGG 
CGCGGGGGCG C7GGGCGG77 GAGGT7CGAA GGGA7ACACC C3CTCACTCG GAGGACC7TG 
AGGAGCCCGG CC77C7G7AG A7GCCCCGCA AGCGCC7TCG GGACGGG777 CCGGGZGGGC 17400
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AAGCCACGCG C GAG C AC ATT GGCCGCTTTG GGGGAGCAAT CCC7G7GGCG CCAGAGGTGC 
ACCC7GGC7G AACTCACCGA CAAATGTTCC CGCTTGGGCG TGCGGCGGAA TCCAACTGGG 
G G C AG C AG G A TTCAGCTGGC TGCTAGGAAT CCCCGTATAT GTCCAACGGG GGGAAAGGGG 
ATCAAATT3G CCCGTGGTTG GCGGATGCAC TTTCTCCGGG AGACCAGACG CGCCCTGAGG 
CCACCATCCC GTGACAGGAA GATCTCCCCA TGGAAAACAC GCAGGTATCC acggggacgt 
AGATGGCAGC CTAGACCCAT CGCGCATGGG AGGGGCTAGT TGCCCCGTAT CCCCCGGCGT 
CTGTGCGACG CCGGAGACCC CTGACACAGT ACCGGCAAGC CGTGTTTCGT GCTGCGGCT7 
GGGCGGCGCC GTGCCCGGTA GGCCTGCACC AGATGAGTGA GGGTCTGAAG GGCCGGTCAG 
CGTTGATGGA GCAGGCGGAT CTCCGGGAAC CCGCCACGTA AAG G AC GAG G CC7GCG7AAC 
TTGTCGCGTC CCAGAGGACC CCATACCTGA GGTAGATGCG CCCTCATTCA CTGGTATCCA 
CACGGAGCAG GCAGCCCTCT GTTCAGTCGT TATATCGCCA A C ATTGTAAT AGCGGTTCGA 
TTTCCGAGGG CGACCCCTCA GCCCCGATGG CGCCTTAGGG GGAGCAGGTG CTGCAGCCCC 
7GCC7CC7CG 7AGC77TGTT CTCTAAGTAA AAGGCACGAG AGTTAACGTG GTTAGGGTAC 
CTAAAGTAT7 TCCCGCCGAC ACCAACGCAT CAAACCTCAC ACCCCCTTCC CCGAGTTACA 
TACCTAGTGT CACTGCGTCG CGTAGCCGTG GTTTGCATTG GGGGGGACAA CAGACACTGA 
ATAAAT C G CT GCAG7TTTTC AGGACCATAC GCGGCCCCAT AGCAATACGT ACAGTTTTTA 
AACGGCGTTC GCACCAACTG CCATACTACG TAG CT AC C AC CAAATGTGTC GCTGTACCGT 
AAATCGTTCC GCACGACGGC CCTCCTGGTT CCACGCAACA GTCTCCCAAA ACGTCCATAC 
ACCGTCTGTC CCACGACAGG CGATGGTCCG T AG A CT CT AT CACACTCCTC ATCAAATGCA 
TGGTACAC CG AATACCAGCC AGGCGGGATA TCGCTGCCGG CAGGCAGGGG CGCGGGGGCT 
GCAAAAAGAA GGT7GTTCCT AT C AAA C C AG GAAAAATAGG GAAACTTATT GTTTTCAAGG 
GCATCAA7AA TCCATAACGT GGCCCATTCT GAGCCACCGG CTTT AG G CAT GGTCCGACAC 
AGAAACCGAT CGGCGTTCGT CTTTG AGG C A CAGTCCCGAC TGAGCCTTAT AGTGCCCCCC 
TTCTTGCTAT GAAAAAAACC CACGACCGTT ACGCAAATTT GAGGAGCTAC TCACCTAAAA 
GTAGCTCCTT TGACAAATGT CCTGG7TTTA TACCAA7TGT TCACAATGAC ATATTGTGCT 
GGCGGAAACA GGTGTCCCGA TGTATCCTCG GCAAGTAAGC ACCATTACCA TGTGCCATCA 
TATTGT3TGG CACAAAAAAA GCAACTTTTC ACGCACGCAG CATAAGACCC GAGCCAGTCG 19020
CGCCCTCCAT CGCGCCTGCG AATTTTCCCA CCACCCAATA TTGTGGCAGA TCTTTCTTAT
GTATATGTGG TTACAAACAC CACGCCCCTT AAGCTGTCCT CTCTCCCAAG GGGACTAGAT
TATAACAGTG ACATACGAAA CCGAGACGCT CTCAAATGCT TTCTATTTTA TTTATCGATT 19200
CCGGGTTAAC ATAATCACAG GTAGCTATAA AATCCCCATC CTCTTGACCT G37AACCC7G 19260
GCTTGAGGTT TCCTCTGTTA TCAAACAAAC CTGACCACAA CTGTACAGAG AAAAGTGGGT 19320

_ „^., r , r r -T-A'™TAAC CACAGCCCGT C AAA C C A C AG 153 SO 

G AAAT G TAG T GTTTATTTT~ .-^^ACACT 
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CGAGGGGGAC CATATCGAAC TGTTCTGCCG ACGTTGGGTC ACCTCCGATG 
TTTTTTTAA7 GTGCTCATGT CCCTGTATGC GATATTGTGC CACATTAAAA ACATCCAGAA 
C AG C C CTAG A TGACAGTCCG C AGAT C AC AC CAAACTTCTT TGGAGGATTA TTTCCATGA7 
ATAATACGGT AG ACTTG C AC AAATTCTTAA CATAAATGCC AGATCGGAGA GAAACTATCA 
CAAGACCCGA AGCAAACGAG CGCAGCACGG CCGCCAGCAG GTTAACGTCT CCTGGCCCTG 
T3TTATTGTC GTCAGGTTTG GGCAACAAAA CTCTTAACCC TTTGCGCGAA TGCAAGCAAG 
AGTGGCTAA7 GTCTGCCAGT GGGTTCTGGG AACATAGAAT AAACACCTTT CGTTCCAC7T 
C C AAAG AC AT TGCAGGGCGG CCAAAATAAA ACACTTCCAC ACCAAGCCTA TCGGTTATCA 
TTACTGGCGG CCGTGCCACT CTATAATATG CGG AT CTAAG CTTCCTGTGG CGAATG CGCC 
TCGTGGTAGG CCTCTCGTGT CTCCGTGGCC CATCATCCCA TAAAAATTCG CCAACAACTG 
GCCGGCGTCT GGACGCCGGC GGCAGTCCAG CACCATCATC G ACTTCTT CG TCACTTATCT 
CCAACACATA TTCCCCTGCT ACATTCTGGG CCTCGAGTGC CCCAGCTAAG TACACATCCT 
C7ACACCCGC CCCGACAGCC GAGGCGGCGA TTGAGCCCTC TGTTACCACG CCGCTTGCAT 
CCGTGTCGCC TCCGGGCTGT GATGTTGCGA TAACATCCTC TGGGATGCCA AGCAGATCAA 
AGAGGTCTTC ATCGCACATC GCCCTCATTA G C AT GT C CAT CTCCTGTCCC ACGTGGTACA 
TCAATGCACA TGCAGATTCT TTATCAAGCA GTGTGAGGTC ATCTTCAACG TTGTCTGTGT 
GCACCGTTGT TTCATCGGCC GGGGGGGGCT GCGAGTCGCT ATGACGCGTC GAGGGTCCTT 
CGTCTCCAGA GCCAGGAGAG TCGGCATTGG CAT CAT C AAC TGGCTGAACC CCAGACGCAC 
TATGGCGCGT CGATGGTCCC TCGTCTCCAG AGTCCTCAGA TTCCGCGCCT GTCTGCGTGA 
CCGGCACATC GCAAAAGGCT GGGTGATCCT CCTCACTGGA ATCCGAGTTT TCACCCACAA 
ATGGCCTACA GAAAAAAAAA CAAATATGTC AACCGGACTA GGGTGGCCAA ACCATTTGCC 
CCACCCC7CC CCACTCTTTC CCCAGGGGAC A CAT CTTAC C TTGGTCTTCT CCGATGCTTC 
TCGAGCCGTA CACTGTGTTG ATA 2 AAAATT TCCCATAGTG ATGACCCACT GTGTAGGTGA 
GTCCTGG2AT GAACGCACCA CCAGCATTCC TTTACCTCGG CACACAGGAG GCGCCACCTT 
CTACAATTAA TTCCCTGTAC GACCTCGTAC TCTTCACCTG GCAAGCGTCT AAGGCGCCGC 
GACGTGGTAC ATATTTTCCC AAAAGCCGTA ATCGGCGAGC CCAGTAAATC TCTGGGATGC 
AGGCCCTTCG ATAGGCATTC CCTCTTAAAA TCAATGAAAA ACTGTAGGCT AT2CAGAGGA 
ATT AC G T CAT TACGGGCAGC CGGAGCAAGA AATGTTCCAG TAG AT CT AT C TAG C C ACTTG 
ACCAAAGGAT ATTTATCAGA GTCCAAAGCA CCTACAATAA ACTCAGAAAT CCAGGTAAGC 
CTGCGTCCCG CCATGTTGAC CTGT2AGAAT GGTCTGCCTC CGAGCATTAC C2CACCTCAA 
CAGAAGTAAT CTACTACGCA AACCACAACA TGCTTCCTGC AGCTTTAACC TTCAGTGACG 21300
GGTCAAAAAG CATTGCCTGT ATTAGACACA TGTGTTTCTC AGTATGAATC GTGCTCTCCA 21360 
GCGCTGGCAA GAACATCTGG GGTGATGCTG CCCGGGACCA GCTTTGAAAC AGGGTATTGC 
AAGCCCACAT GTTTGTCTTA CTTTACTAAC 
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GGGACACCCC CTTGCCTTGG C AG CTGAGTG AATCCCAACC GCCTAGGAAA AAAATAACCA 

ctcagacttt attttgcagc cacacggtgg cgctaacctt TAATGATGTC CCACTCAGTG 

AGTTTGGCCA CTCCCAAGCC CACATGGGCC TACTATAACA GG AAA CAT AG AAGTTGCGGA 
TAGAGCCTGG TTTCTAACGG CAATGATATT TATAGTGCAA AACGGAGGGC GGTAAGACAA 
AGGGAGGTAC CCGGACAGAG TGACAAGAAG ACTTGTCAAA ATTTTAGTC7 CTGTGGTAAA 21780
ATGGGGCAAG GTAAATGTGC AAAATGACTG GATAGTGATC CGAGTCATAT TCAGGCGACG 21840
GCCGGCGGCC CAGAAACAGG GACGCGTACC GGGACCCTTC AGGTTCTCGA TTATGTCGCT 
CCACGTCAAA AGCTTGTTGG ATCTCGTGGC GGTGGGACAG GGGCCTACAT TTGCCTA7TC 
TTCTTCGCGA TGCATTTCCA ACAAAGTATG CTGGGTATTC CAATAATCCC TTCAGAAAAA 
TGCCCATGTT TGTACCGATG GCCACAACTC CCATGGAAAA CCTGTCCAGC GTCTGTTCCA 
AAGTTCGGTT TGCGTCCACA CTACAGTGGG CCGTTCTGGG AAGT AAG CAT TTATACGGGG 
GTACCGTCTG ACATATGTGT TCAGGGGAGG CCTCTGGGAC TTGGGAGCAA ATAACGATGC 
CCCCCGTTAA ATCAAAGTGG GTCTTCACCT TTTCTCCGAA ATAATACACT TCCACCACTA 
GGGGCACAAG CTTGTCACCC ACTTTGTAAA TAGCCTGTTT CTTACTCAGG TATGCTGCCA 
CGGATTGGGT GGCGGTTAAG ACCTTGGGCC TCATGTCGCT TCCATACCAG TAAAATGTCT 
GGTCAGCTTT CTCTTGGTCC TCGACGTCCC GGTCATCACG ACACAACGGT GGAATACAAT 
CAATAAAATC ATCCACATTG TCGGAAGCTT GGAAAGATGA ACCCATGACA GAG G C C C C AG 
GTGCCGAACT CTCAAGGGGA TGCGTGGCGG G AAGT ACT G A GACACTCTC2 GTGGACCCCT 
— TCACCTCC CTCCGACTGC ATCGGGCCCT GAGGGCTCGC AGTTTCACA2 AGAAGTTCAC 
TCAGGTC322 TAAGTCAGGA AGCTCCTGGC CTGAACCCAT GA2AGAGGCC CCAGGTGCCG 
AACTC7CAAG GGGATGCGTG G C G G G AAGT A CTGAGACACT CTCCGTGGAC CCCTCCTCAC 
CTCCCTCCGA CTGCATCGGG CCCTGAGGGC TCGCAGTTTC ACACAGAAGT TCACCCAGGT 
CGCCTAAGTC AGGAAGCTCC TGGCCAACAT CTGACAAGAG ATCTAACAAA CACCCCTCAA 22860 
TGTGATCCAC CATCGGTAGG CAATCATCCA GCCCACTGAC ATGACTGGGG ACGGGGCCTT 22920 
CTGGGGAAAA TGGGGTTTGC GACTGTCCAG CAGGCGGCGC TAATAAGCC7 TGTGTCTCAT 
GTGGAAAAAT AACAGGAGAA GGTAAACCCC CCGTTGGCAA ACATAGATCC GTCGGGGTGT 
G CACG7GTAA TGGGCCCTGC ACCTGGCTCG TGGAGGGACG CGGGGAATCC GGAGCTAATA 
AGCTCGATGA CTGACCAGAT GACCCAAACC CCGACGGTTC TGGC7CTTCA AAAAACAAAC 
TGTGCATATC CCTCCCTACA AAACCCTGAG C22C2ACCCA AAGTTCGTTT TCGCTGTCAC 
TCGATTCCGT ATCTTCGCTC TGTGACCGTG ATGAAACTTC AGCTGCGGAG GATGTTGTGG 
GCGTGGCGAC TGCCGCCGCC TGTTTCCTGG CGGCCTCCCT AAACAAAAGT TAATTACACA 
AAGGTAAG7C TGAGTGACAT CT'JCAATTTC CCGTGATGCC CGCTGCACGT A2ATCCCGC2 
GCCCACACAA CCCACCGCCC AGTACATCAA CCATCCTACC TCTGGGCTTT TTTTCTAAGG 
CTCCTT2TAA GTGCCTTTTC TCTGTGTTTG TCATCATGGG GATAGATCC2 AAA2AATGCT 
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TTTAGCA7G7 T77TCA7GGC TGG77CCTGC GTCAAGTACA CAAGACA7CC TTCACATCCC 
TTGTATGGCC TAGGTGTCAT AATCCAGCGG TTGAGTTTCA 77T7TCCC77 ATAGATGG7A 
AAGGGCCTCT CCTG7C7GGC TCGA77GGCG G7CC77AATA GCCGTCCAAA GCAGCCCAGG 
CCAGTCTCAG TCTCCGGGAT 77C7GGCAGC CCGTGCCTAC GTCGCTCCTC C AAAAATG Z Z 
TCATAGAAGT CATCGAAGCC TTCTGGCATT CTC7CCCGCC GG77TCGACC CGGCACGGTG 
AATATTGTCT TT7G7TCATC CAACCACCCT ACCCCCCAGA AGCGTCCACT GTC7AAAGCA 
T CTATAAT AA AGTCCGTGAG CCATTCCGAC TCCGTGTAGC G AGG CAT C77 TTT AG G C AAA 
AGCCACGACA CAAAACACCT 77TCCGTGGG CGAC7TTCTC GCCACAAC7A GC7GGACCCC 
AACCCCAC7G GCACGTAGAC 7CTGTGCCAT CTAACAACAA AACTCAATAT A7GCAGC7CA 
ACACCGCCCC CCCCAGCCGG TTGTCGGGCT GCGGAAACTT GTGGTTAGAA C7CACTACGG 
AAAAGGGAAC CAA7G CAGTT GAACTACTGG CACACACCCA TAACCCGGGA CAGCACCCAG 
GCACTGTCCA CCCTCTAATA CAAGCGGCCT TTGGACGCGA GGGAGGGGTG TCATGGTCAA 24240
CAAACCAAGA AAAACACATG TATTATTCAA TTAGZCAACA ACTTTATTTA TTACCGACAG 24300
GAGACATGAG ATACATAAAT T7CCAACCGT GCATAGGGCC AATACCATCT G7GGAGCG77 
AAG7GCCCTG TGGAG7777C GCCTAA7TAG C7GAATCTCG ACCCCCA7TG CGGCCAGCAT 
GCTCACGAGG AA7 AG G C AG Z AG AGG CAGGA CCTAACTAGG AGCATATCCG G A Z CTG AT Z Z 
AAGTA7G7GC ACCAAGG7GA GCAACACTGC CGCGAAAGGC AGGAGAACAA ATAGCGCTCG 
TCGGGAGGCG ACGGATACGC CCACGCA7GA CAGTAACCCA A C AT AAAA7 A GCGTCATATA 
CTTATC G AGG CCAA7 CAGGA CCGGAG7CAG CAGGCCGA7C GAGGCCGTCG ATATCAGGGT 
GGCCAGCAGT AAGGTCACAA ACACGACAAC CTCGCGCCTA CAGTAGGCCC AGGCCTGGAA 
CACTGAATAG GTGA7G7AC7 7CCCGGGGAT GATGAATATG GCCZZZZZZZ 777GCATTZZ 24780
GGCCCTGATG TACACATGCT GTTGCAGGTG CCTAAATGCC AAAAGTCCCC CGACCAAGAA 24840
GAGAATGAAG GGCAGZGAGA AAACGCCGGA CAGAAAGACZ TTCTTAAACA AGAGAAGGTA 24900
GT AC A C CAT A AATGCTCCGC AGAAGCCCAG CTCATAGTAC CTGTGTACTA 77GGCGGCGC 
C7GA7ACACC GCCG7TGCGG TGGCTAGCGG A7AAGG7AAC AG C AG 7 AAA C AGTTAAG7AC 
GCACAGACCC GG TAT GAAGG GCACACGAGA AAA7GTAAAC C C AGAAAAG G CCGCGCAAAC 
TACAGCAGCA AACACTGCTG ACGCGCAGAT CCATTCGAGG CTCCGGTCCA GCTGTTTTTG 25140 
CGCCGCAGGG CACAGACACA TGCATATCAG GGCCAAGTGC GTGACTGGCA GCGACCAGAA 25200 
AAACACGGCC GTGATCTCTG 7GGTAAAGAG TGTGAACGAG TACAGGGCCT TGAAGATAAA 25260
ACACCACAGA AAGGGGGTCG CCGCCAACGT CCCGCTCAGA TAACTGAAGA GCGACAGAGC 
GCGCTCACTG TCCAGGCGGC ACATGGTGTC AAATCAGGGG GTTAAATGTG GTTTTGGGCA 
CCTTCCCACG ATCCCTGGAC TGGCTCGAGT CTGAGCGCCT CTTGTGAGGC CTCTTTGTGC 25440 
TGTCCTTAGT TGGCGCCGCT GGGGGGCAGC TGGTGACAGA GGCAGCGTCC T TAGAGG GGT 
CCTCCAGCGG C Z CAAAGGGA CCAACTGGTG TGAGAGGGGG AGAATCCGGA GACTCCAATT 
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CCGGCTGCCT CCTGGAGTCC GGTATAGAAT CGGGAACCTT TTGCGAAGAC 
CGGCAGACAC AGATCGGTTT ACCTCTAAAA GTAGGACACT TAACTTTACG TCACCTGAT7 
GGCAGCCAGT G GG C AC AC CT TCCACTTCTA ATATTTCG7T GGAGTGCCAA ATCAGCCCGG 
GGGTAAACCA ACCCGGGACT TTACACAGTC TCAGGGCGGC GAT7AAGGAC TCCAGGCTAA 
CCCGGCTCAG GGCGTCGGTG TGCACCACGC CCACATCCAC CGACTTCT7C CCCTTCAGA2 
CATCCCAGCC AGAAACGGGT TTGGTTTCTG GCTTGAAATC AATGATCTTG CTCACGCCAC 
CAAGAGAAAA TGTCACGATC GACAGCGTCT CGCTGACAGA CACAGTCACC GTTTGGTCCT 
CTTTTGTTTT TTGCTGCCTT AG C C ACTTAA GTAGGAATGC ACCCGTTTTG CCACAGAGGA 
GAAGCCTGGT GGTCCTACCA CCGGCTTCCA TCCGATCGTG GAAAGGTAGG AT A C C CTTT7 
GGTCCACCAC GCTTTTGTGC ACGGTGGAGG TGAGGTTGTC CCCGTAGGAA ATGGTGGTCZ 
TGACGAACTG CGGTTGGGCC CCCGTATCGC ATGCCTCCCC C7TTCGATAA AAGGCTATGC 
CAGCGTCGAG TACATTCGCA CCGAATAGCT CACGCGTGTG CGTGAAGCCG CTACCGACGG 
ACGTATTCGT GAAGCTGAAG CTAACGTCTC CACTGCCTTC CGTGTGTCCC ACCAGGGGCG 
TAAGGGCATT CTTTATTCTT AACCCCAGAA CGCCAGCTGT CCCCACGCTG G AC AG C ACAC 
TGAGGGTTGG CGTGCAAGCC GATCCGTGCA CTTGCACTAC TCCGGTTTTA GTGGCACTCT 
TAATGTGTTC ATTGACCCTC CTGATTTTAG ACAGGAGGGT CACGTCCACC CTGACCCCAT 
AG TGAAAAT C CACAGGCATG ATTGCGGCCG TAGACGCACA GAGAAA7CAC AGGAAAGCTG 
CGCGCACACT GGGTGATCTG GAGA CG AT AG ACTGCCTTAA ATAGAACTTT TAGGGGAGGT 
GGAAGTGTGC GACATGGACA GGTTAACCTT CACAAATCGT CAGTCACACA CGTGGTGTAA 
TCAGAATTGT CTCGCTCAAA AAAATTCACA GCCTTGAAAC TGCCGGTGTA TGAGAGGGGG 
CACGCTT27G GCGGAGGCGT GCCAAATATG GGAGGAACGA AAATATCACG C AG AAT C CTG 
TCAGCGGTGG CTTCCAGGAA CCTCCGGATG TCGACCACG7 TAACAAGCG7 CACCCCGGCC 26880
GCCTTGG2CT GGATAAACCG AATCTCAATA TTCACTGCCT CCCTGAACAG CGCCTGGACC 26940
TCTGCGTGAC TGGGTTTTTC CTGTATCTC2 ACCATAGTG7 TGTACAACAT ACTGGCGGCC 27000 
TTGGTGTGCA GCAGCTCGTC CCTGGAAATG TAATGGTTGG CAAGGCACAC CCCGGGCATG 
ATGCCTCG2A CCCTGCACAA ACTGATAGAG 7AGAAGGAGC TAATAAAGTA TATCCCCTCC 
ACAATCAAAA AC AT C AG AAT CTTCTGAGCT TTGGTGGTCG CCTTACGCAC CCTGGAGTGA 
AGCCACTCCA gcTTCTCG-A AAGGGCGGGG TCCAAAATGA TCTTGGCAGZ ATATGCTAGA 
AGTTCGCCTC GACTGTTGTT GAAAAATATC TTCAAGATAT 7GG2ATACAC GACACCGT3G 
ATATTCTCCA TGGCAACCTG TTCGGCATAA TAGTGGGCCA CGTCGTGGCT GTTAAAATTT 
GTGACAAGGT CCTCAATGTT AAAGTTAACT AGG2GTTCGG CCATTCCCAA AAACGTAAAC 
AAAAATCTA7 AAAAGTCCTT GTCGGCATCG CTGAGCTGGT GZACGTGGGA AACATCAAGG 
TGCAGGGGTA TCTGGCTAGG AAACCATCGG TTCTGCCAAG TCTCGCGCGT TAGGGCCAAA 

rCCATTG"- C2TCACCCGC- 2 7 S 0 0 
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CT7GCASAGA CCTACCTACT GACAGACCAG GCACTCGGGG TCTGCCGCGC AGGACTC77I 
CTCCGGGTTT TTAGGTCCGG G7AACCACGC CCCATCTTGT T7CATCCCAG AGTGAGGCGC- 
TGACCCTGGA TCTGCCAGGC ACTGAAGAGC CGTCAGACTA GATTGCTTC7 GAACCCTACA 
GTAGTACATG AGGGTTTTTA GACCAAGCCT GTATCCATGT AGCAGCAGG7 CCC7AAGA7A 
GCTCGCATTC CTGACTCTGT CCTCCTTGAG GAAGAAGCTC ATGGACTGGC TCTGGTCTAC 
AAACGGCGCC CTGGCACGAG CCCTGTCCAG TAG CTTAAAT GGACAGTAAT CAAAGGCTG7 
TAGGAATACC CTATATCTTT CCCTGTGATG CTTGGGGAAC GTGGAAACGT CCCCACCATA 
CTG7C7AACC ACCCGAAGGT CGTCGGGGAG AACCTTCTTA AAAAAAGTCA CATTGGGCC7 
CAACACC7CT TC7TTATTGG TGACCTTGGA AGATATATTA G C AAAAAAG 3 GGTACACAGA 
CTCGGCATAG CCAGTTAC7T GCGAGGTCCC AGCCGTCGGC A7CACCGCCA GAAAC7GAGA 
A7TGAA7ATG CCATGCTCGG CAATGCTCTT TCCCAACGCG TCCCAGCGAT GGCGTG3TAC 
AAACGAAGCA TCCTCCCCCT CCCATGTTTG CCAATGAAAC CTGCCCTTGG CGAAGTTAC7 
GACCTCCCAG CCATGAAATG GGACACCCTG TCC7TCCAAA ACAAGGTTG7 GACTAGTCTC 
CACCGCGG7G TAG7ACATAG ACTGGAATAT A7TCTTGTCT AACTCAGCGC TCTCAGCATC 
GAGGTACCCG TACCCCAATT CCGCAAACAC A7CCGCCAAC CCCTGAACAC CAATCCCCA7 
AGACC7CTCC T7TTGACC7C GCTCGACCCC CGGTGTTGGA TGGGAACCAC CCAGAATGCA 
GGCGT7GATG ACGAGGACTG CCACCCTTAC TGCGTCGCCC AAGGCCTCAA AACAAAAAAA 
CGGC7TG7TG GCG7CCGTGG TGCCAACCCT CGCGCTTTCA ACAGTTCTCA GACAC777GG 
AAGG CAG AT A TTTGGCAGGT TGCACACCGA AGTGT7TCT7 CCTGGCAGT7 GGAC7AT777 
TGCACACAAG T7TGAGCAG7 T AA7G G C CAT GC777GAGTG TCGGTCCAGT GGTGTTCAT7 
GAGCG 77777 77TAAAAGCA CGTACGGTGA GC77GTCTTT ATGATGG7G7 GGATAAGAG7 
G AA 7 AT CAT A GACTTCAACG GCATGCAACT AAC37A7T77 CCAGCCCGCA CCAGGCGC7C 
GTA777G77A T CGAACGCAG CACCGTATAG C77 AAT C AAA T7GGGGGCGG TGGCTGGATC 
GAACAAATAC CATAAC7TGG A7GGGTCCTT 77CA7ACATC C7GAAAAACA A7G77GGGA7 
GCACACGCCC TGAAAGAGAC TGTGACATCT GTCGGGATTC TCCGGTAGTT TGGCG7TCAA 
AAAATCACAG ATTTGACTGT GCCAGAGTTC CA7G7ATGCG CTCGCGCCAA CGGGCCTGAT 
GTTATTG7CA 77GAAATAA7 GAACC7GGGC ATCCACCAGT TTGAGGCAAC TGG7TA7G77 
CT7TTGGTGG GAGAATGACG TAACATCCAG ACCCACGCCT GACTTACTGG C CAG C AACGG 
ACTCATA7CG TGGTACAGGG CGTCCAAAGT ACCCGACTCA TTCA7CA7GG AGG3C7GCAG 
AATAAAACAG CTGGCGAG7T G7CCGCCTTC GAC7CCAGCT GAGCGCAGTA 773GCG7GGC 
GCAGCACACG TGCTGCGCAG CGAGG7AGCC AAAAACGTAC T C C ACT AT AG CCATC7CAGA 
TACAGAC7TA GCGTCCTCAA TAAG3TCCCG CGCCAACCAA TACAGGCAT7 CA7G CTCTAA 
GCACTGACAG GCAACAAACA CGGAAACCCT CATAAACA7T 7GCGCCACGC 7TTCATAGAC 29580 
AGGC7C73TC CCCATGGTCC TTAGGACGTA AG7ATCATAC AACCTCACGG CCGA7AGG7A 29640
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GCCACAGTTA AGTGTGTCCT CGTAAGCTTT GGACCGTCTG TAGGCGCACA ACATATC77C 
CAAGGCATCA ATGTTCTTTT GAATAAACGA TTCCACCCGA TGTCCCAACA CGCCTCGAAA 
AATCCCAAGA TACTGC7TGA GAGTCGCTGG GCACC7AGCC TCCATAATTT GGTGCCACAG 
CCGCCCCGCC ATGGCATTGG CCCGCACGTC CCACCCGACC CTAACCTTTA GAAAGTCTAT 
GAGAGATTGG GC AC AC AT AT CAAAATCCGA CAATTGTCCC GCAGACACCT GAGACCCG2G 
TCGCTCTGGT GGGACAGCTC CCAAGTGAAC CTGACAAAAT GTCCGGACAG ACATGACCTT 
ACAGAAACAC AGTCCAGGGG CCACACGCGG CCTCAAAGTT CGCAAACACC AGTACAGGCA 
AGGACGTGCC CTTCACGTTC AGACTTTGGT GCACCGGATG AGAATCAAAG GGAACTGTGC 
CCAGCGTACA AACCGCCCCA AAAACAAGCC GATTTATATA CAGCTCGTGC CTCAGCTGAA 
TATACTTGGT CCGGATTACA TCCGTAAAGT GATCCTTTAT CATGGCCACA ACCTCCGCAA 
AGCCCTTCCC AGACTGGAAA AACGTCAGCG CCATAGATGG TCTCTGGTTC ACACGGAGAT 
AAACCAACGA GGCATAAATA GTAACGTTTA GGCCTGCCGG TTCCCGGCGC TGGACCATGG 
GACATGACTC ATCCAAATCA ACT AG CAT AT CACAAGGGAG GGTCAAGCCT ACGTGTGCAC 
GGGGCTCGTC CCGGGCCAAC CCAACTCCCT TCATGGCGGA GGTGACCTTC- GTCACGAAGG 
TACTGTGGAC ACTCTGGACC ATTGGACCTA CTGGGGTAAG GAGGGTATGA AACTCCCCAG 
TGTCCATGAG TTCACTCAAG TTAGGGATGA AATCCGCCAG GCCGGATCCA CTTCCGTACC 
ACACACCGGC CACTTTGTGA GTCTGTGGCG CTTTTGCCGC TTCCATTCCA GAGAGCATAA 
ACAGGGACGT GGGTGTTAGC AGCATATCCA T AG AC GAG CC GTTGTCCTCC TGCTTGAATG 
AAAATAAAAA GGTTCCCAGA GGCTCCTGGG GACTAAAGGT CTGTGAATAC ACGAGGAAAT 
CTCCATAGGT CGGCTGCCTA AACGGCGCCT GCCGCAAGGC CTCATGCAGC GAGCCAACCG 
TGGGTCGTGT GGACGCCGCA TATTTAGAGA GTAAATCCCG CACCCCCCTG GCAAACTCCG 30900
GTCCTCTAGT GAGGGATACC CGGTGAGTTG GTGGAGGTAA AAGACCCAAC ACTTGCCTAC 30960
CCAGGCGAGC CGCATTTTCA GCCTGCACCT TCATATCCAC GCCGGCAATG GACGGCACAG
ACGCTCTTGA AAAGCTTACC AAAGGCCTGA GTGGGGGAGG CGGGAGCCTT CACCAGACAA 
AG CTG TTG AT GGAATTTCAA CTCC3AGGAC 7GCCGGTGCC TGCCCTCTTA AACAGCAGCA 
C AAC AG AG C A GTTTTTAAAT ACTGTTGCCC AACTGCCGAC GGACCTATCA AAATT 
GCGACTATCG CGTGTTCGCA CTGGTTCGCG CG3CGTATT7 TTTAGAACCC CCTTCTAGCA 
TCGACCCCCT TGAGGCAGCG CGCGCTCTTG GAC3CCTGGT TG AT AT ATT A T CAT C AC AAC 

, , ^ ^,,™ C ,- GA "GACACCCT 3 AA7AACTGTA 3 13 SO 

CACCGCAGAA CACCGCACCG G„o-~^~ hC ^'^ Gn ^ 

CATTGCTCAA AC TA CT AG CC CACTACGCGG ATCAGATAGC AGGTTTCAAA ACCCCC3CTC 31440 
TCCCTCCCGT GCCACCTGGA ATCATCGGCC TGTTCACATG CGTGGAACAG ATGTACCACG 
CATGTTTTCA GAAATACTGG G C AG C TG C AC TACCCGCAAT GTGGATACT3 ACATACGACC 
CTCCCACTTC TCCGTTACAG GACTGGCTTA TAGTCGCCTA TGGTAACAAG GAAGGACTGC 
TACTCCCCTC TGGCATACCC TCGGAGGAGG TGTTAGCCAA AACATTAGTA AC. 
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ACGAGTTGT7 CGTATCGCGG TCGAATTCGA CCGAGACCGC CGTCACCATG CCCGTATCCA 
AAGAACGCGC CCTCGCCATC TACCGGGTGT TCGCCAAGGG TGAGGTGGTG GCGGAAAATA 
CTCCCATTCT TGCCTTCACC GACGTGGAAC TAT C C AC ACT CAAACCCCAC TATCTGTTCA 
TCTATGATTT TAT CAT AG AG GCATTATGCA AG AG CTAC AC ATACTCATGC ACCCAGGCCC 
GCCTGGAATC CTTTTTGAGC CG AG GTATAG ACTTCATGAC TGACCTAGG7 CAGTACCTAG 
ATACCGCTAC TAGCGGCAAG CAGCAGCTGA CGCACAGCCA AAT AAAG G AA ATCAAATACA 
GGCTGCTAAG CTGCGGTCTC TCGGCTTCCG CGTGTGATGT TTTCAGAAC7 GTGATCATGA 
^CCTCCCATA TCGACCGACC CCCAACCTCG CTAACCTGTC CACGTTTATG GGGATGGTTC 
ACCAACTGAC CATGTTCGGA CACTATTTCT ACCGGTGCCT GGGCAGCTAC AGTCCCACCG 
GCTTGGCCTT C A C AG AATTG CAAAAGATAC TGACACGCGC C AG CGCGG AG CAAACGGAAC 
GTAACCCGTG GAGA CATC CG GGTATCTCGG ACATTCCACT GCGTTGGAAA ATATCG CGTG 
CTCTAGCATT CTTCGTCCCT CCGGCCCCCA TAAACACTTT GCAGCGCGTG TACGCCGCGZ 
TGCCCTCGCA ACTCATGCGG GCCATCTTCG AGATCTCGGT CAAGAC CACA 7GGGGAGGCC- 
CCGTACCGGC AAACCTGGCG CGCGACATTG ACACAGGACC GAACACACAA CATATCTCCT 
CCACACCACC GCCCACCCTC AAGGATGTTG AGACATACTG TCAAGGTCTG CGGGTGGGAG 
ACACGGAGTA CGATGAGGAC ATTGTGAGAA GCCCGCTCTT TGCAGACGCG TTTACCAAGA 
GTCACTTGTT GCCTATACTG CGCGAGGTTC TGGAAAACCG C CTG C AG AAA AA C AG AG CT C 
TGTTTCAGAT AAGATGGCTG ATAATATTTG CTGCCGAGGC GGCAACCGGG CTCATCCCTG 
CCAGGCGCCC GCTAGCCAGA GCCTACTTCC ACATCATGGA CATTCTGGAG GAGAGACATT 
CCCAAGACGC CCTATACAAC CTTTTGGACT GTATCCAGGA GCTCTTCACC CACATCAGGC 
AGGCTGTTCC AGACGCACAG TGTCCGCACG CCTTTCTACA GTCCCTGTTC GTCTTTCAAT 
TCCGCCCTT7 C G TACT C AAA CACCAGCAGG GTGTAACCTT G TTTCTAG AT G3CTTGCAGA 
CATCCCTCCC CCCGGTGATA AGTCTGGCCA ACCTTGGAGA CAAGCTGTGT CGTCTCGAG7 
TCGAGTACGA CAGCGAGGGC GACTTCGTGC GCGTGCCAGT TGZ^CCGZZ^ GAACAACCAC 
CGCACGTACA TCTGTCGCAT TTCAAGAAGA CAATACAGAC CAT CG AACAG GCCACCAGGG 
AGGCCACCGT AGCCATGACA ACAATCGCAA AG C C AAT AT A CCCCGCCTAC ATCCGGTTAC 
TGCAGCGGCT AG AAT AT CTT AACAGACTCA ACCACCACA7 TCTCAGGATT ZCZTTZZCAZ 
AGGACGCCC7 TTCTGAACTC CAGGAAACCT ACCTGGCGGC GTTTGCACGG TT3ACAAAAT 
TGGCAGCGGA CGCAGCAAAC ACTTGTAGCT ACTCCCTCAC CAAGTACTTT 3GAGTTTTAT 
TCCAACACCA GCTGGTCCCC ACGGCCATCG TTAAAAAACT GCTACATTTC GACGAGGCTA 
AAGATACCAC AGAAGCCTTT TT AC AG AG CC TGGCACAACC CGTAGTGCAG GGACAACGGC 
AG3GGGCGGC TGGCGGGTCG GGTGTCCTGA C G C AG AAAG A ACTTG AG CT Z TTGAACAAAA 
TAAACCCACA GTTTACAGAC GCTCAGGCTA ACATTCCTCC AT CT ATT AAA CGTTCATATT 
CAAATAAATA TGACGTCCCT GAGGTCTCAG TCGACTGGGA AACGTACTCC CGGTCTGCCT 
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TCGAGGCACC GGACGACGAA CTCCGTTTTG TCCCACTGAC GCTGGCAGGC CTCCGGAAAC 
TGTTTG7CGA ATAGAGGCCA TGGCAGCCCA GCCTCTGTAC ATGGAGGGAA TGGCCTCCAC 
CCACCAAGC7 AACTGTATA7 TCGGAGAACA TGCTGGATCC CAGTGCC7CA GCAACTGCG7 
CATGTACCTG GCGTCCAGCT ATTATAACAG CGAAACCCCC CTCGTCGACA GAGCCAGCC7 
GGACGATGTA CT7GAACAGG GCATGAGGCT GGACCTCCTC CTACGAAAAT CTGGCATGCT 
GGGA777AGA CAA7ATGCCC AACTTCATCA CATCCCCGGA TTCCTCCGCA CAGACGACTG 
GGCCACCAAG ATCTTCCAGT C7CCAGAGTT TTATGGGCTC ATCGGACAGG ACGCGGCCAT 
CCGCGAGCCA TTCATCGAGT CCTTGAGGTC GGTT7TGAG7 CGAAACTACG CGGGCACGGT 
ACAGTACCTG ATCA77ATC7 GCCAG7CCAA AGCCGGAGCA ATCGTCGTCA AGGACAAAAC 
G7A7TACA7G T77GACCCCC ACTGCATACC AAACA7CCCC AACAGTCCTG CACACGTCAT 
AAAGACTAAC GACGTTGGCG 7TTTATTACC GTACATAGCC ACACATGACA CTGAATACAC 
CGGGTGC77C CTT7ACTTTA TCCCACATGA CTACA7CAGC CCAGAGCACT ACATCGCAAA 
CCACTACCGC ACCATTGTGT 7CGAAGAACT CCACGGGCCC AGAATGGATA TCTCCC3CGC- 
GGTGGAA7CA TGCTCCATCA CCGAAATCAC G7CCCC7TCT GTATCCCCCG CGCCTAGTGA 
GGCACCAT7G CGCAGGGACT CCACCCAA7C ACAAGACGAA ACGCGCCCGC GCAGACCTCG 
CGTCG7CA77 CCTCCTTACG A7CCGACAGA CCGCCCACGA CCGCC7CACC AAGACCGCCC 
GCCAGAGCAG GCAGCGGGAT ACGG7GGAAA CAAAGGACGC GGCGGTAACA AAGGACGCGG 
CGGAAAGACG GGAC3TGGCG GAAATGAAGG ACGCGGTGGC CACCAGCCAC CAGACGAGCA 
CCAGCCCCCA CACATCACCG CGGAACACAT GGACCAGTCC GACGGACAAG GCGCC3ATGG 
AGACATGGA7 AGTACACCCG CAAATGGTGA GACA7CCGTT ACGGAAACCC CGGGCCCCGA 
ACCCAATCCC CCAGCACGGC CTGACAGAGA GCCACCGCCC ACTCCCCCGG CGACCCCAGG 
CGCCACAGCG CTGCTCTCTG ACCTAACTGC CACAAGAGGG CAGAAACGCA AATTTTCCTC 
GCT7AAAGAA TCTTA7CCCA TCGACAGCCC ACCCTC7GAC GACGATGATG TGTCCCAGCC 

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32207 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

CTCCCAACAA ACGGCTCCGG ATACTGAAGA TATTTGGATT GACGACCCAC TCACACCCTT 

GTACCCAC7A ACGGATACAC CATCTTTCGA CATAACGGCG GACGTCACAC CCGACAACAC 

CCACCCCGAG AAAGCAGCGG ACGGGGACTT TACCAACAAG ACCACAAGCA CGGATGCGGA
CAGGTATGCC AGCGCCAGTC AGGAATCGCT GGGCACCCTG GTCTCGCCAT ACGATTT
AAACTTGGAT ACACTGCTGG CAGAGCTGGG CCGGTTGGGA ACGGCACAGC CTATCCCTG7 
AATCGTGGAC AGACTAACAT CGCGACCTTT TCGAGAAGCC AGCGCTCTAC AGGCTATGGA 
TAG GAT ACT A ACACACGTGG TCCTAGAATA CGGTCTGGTT TCGGGTTACA GCACAGCTGC 
CCCATCCAAA TGCACCCACG TCCTCCAGTT TTTCATTTTG TGGGG CG AAA AACTCGGCAT 
ACCAACGGAG GACGCAAAGA CGCTCCTGGA AAGCGCACTG GAGATCCCCG CAATGTGCGA 
GATCGTCCAA CAGGGCCGGT TGAAGGAGCC CACGTTCTCC CGCCACATTA TAAGCAAGCT 
AAACCCCTGC TTGGAATCCC TACACGCCAC TAGTCGTCAG G A CTT CAAGT CCCTGATACA 
GGCATTCAAC GCCGAAGGGA TTAGGATCGC CTCGCGTGAG AGGGAGACGT CCATGGCCGA 
ACTGATAGAA ACGATAACCG CCCGCCTTAA ACCAAATTTT AACATTGTCT GTGCCCGCCA 
GGACGCACAA AC CAT T C AAG ACGGCGTCGG TCTCCTCAGG GCCGAGGTTA ACAAGAGAAA 
CGCACAGATA GCCCAGGAGG CTGCGTATTT TGAGAATATA ATCACGGCCC TCTCCACATT 
CCAACCACCT CCCCAATCGC AACAGACGTT CGAAGTGCTG CCGGACCTCA AACTGCGCAC
GCTCGTGGAG CACCTGACCC TGGTTGAGGC GCAGGTGACA ACG CAAACGG TGGAAAGTC7 
ACAGGCATAC CTACAGAGCG CTGCCACTGC TGAGCATCAC CTTACCAACG TGCCCAACGT 
CCACAGTATA CTGTCTAACA TATCCAACAC TCTAAAAGTT AT AG ATT AT G T AATT C C AAA 
ATTTATAATA AACACCGATA CACTGGCCCC ATATAAACAG CAGTTTTCAT ATCTGGGGGG 
TGAACTGGCA TCTATGTTCT CCCTTGACTG GCCTCACGCA CCTGCAGAGG CG GT AG AG CC 
ACTACCCGT3 CTGACTTCTC TGCGAGGTAA AATCGCA3AG GCGCTGAC3C 3 TC AAG AAAA 
CAAAAACGCT GTAGATCAAA TTCTAACCGA CGCC3AAGGC CTCCTTAAGA ACATTACCGA 
TCCAAACGG2 GCACACTTCC ACGCCCAGGC CGTATCAATT CCAGTGTTAG AAAACTACGT 
ACATAACGCG GGGGTCCTTC TCAAGGGCGA AAAGAGCGAG AG3TTCTCCC G3CTGAAGAC 
CGCCATCCAA AACCTGGTAT CCTCCGAATC ATTT AT CACC GTGACCCTAC ACA3TACAAA 
CCTTGGAAAC CTAGTTACCA ACGTACCAAA ACTTG3TGAG GCGTTCACCG GG3GCCCGCA 
CCTCCTGACA AGCCCGTCCG TGAGACAGTC CCTTTCCACC CT3TGCACAA CCCTGCTGCG 
AGATGCCCTG GACGCCCTGG AAAAAAAGGA TCCGGCCCTT CTTGGTGA33 G3ACCACGTT 
GGCGCTGGA3 ACACTCCTAG GATACGGGTC GGTGCAGGAC TACAAGGAGA CGGTACAGAT 
AATATCCAGC CTTGTGGGCA TCCAAAAATT AGTCA3G3AC CAG3GC3Z3G AC 
CACTGCCGT3 ACAAG3CTAA CTGACCTCAA ATCAACTCC3 GC2ACGACC3 CCATCGAGAC 
GGCTACGAAA CG3AAACTAT ACAGATTGAT CCAAA33GAC CTCAAAGA33 CTCAAAAACA 
CGAGACCAAT CGGGCCATGG AGGAATGGAA GCA3AAAGTA CTGGCTCTTG ACAATGCGTC 
TCCGGAACGT GTCGCCACCC TCCTGCAACA GGC7CCCACC GC3AAGGCTA GAG AG TTTGC 
AGAGAAGCAC TTCAAAATAC TACTCCCCGT ACCCGCGGAC GCCCCCGTCC AA3CGTCTCC 
AACGCCGATG GAATACAGCG CCAGCCCCCT CCC3GACCCA AAGGATATAG ACA3AGCTAC
ATCCATCCAC GGGGAACAGG CGTGGAAGAA GATACAGCAG GCGT7CAAGG
CGCCGTCCTG CGGCCCGCTG ACTGGGATGC CCTGGCAGCG GAGTACCAAC GCCGTGGT7C 
GCCCCTTrC G GCGGCCGTGG GTCCAGCGCT CTCAGGGTTC CTGGAGACGA TCCTAGGGA2 
GCTGAACGAC ATCTACATGG ATAAGCTCCG CTCCTTTCTG CCCGACGCGC AGCCTTT7CA 
GGCGCCGCCC TTCGACTGGC TAACG CCGTA TCAGGACCAA GT C AG CTTT7 7CTTGCGCAC 
CATAGGGC7G CCGCTGGTGC GAGCGCTGGC CGACAAGATC AGCGTGCAGG CACTGAGGC7 
TAGCCACGCG CTCCAGTCCG G C G ATTTG C A GCAGGCCACG GTGGGCACGC CCCTGGAGCT 
CCCTGCCACA GAGTACGCGC GCATCGCCTC CAACATGAAG TCCGTGTTCA AC3ACCACGG 
ACTTCAGGTG CGATCAGAGG TCGCGGATTA TGTGGAGGCC CAACGAGCCG ACGCACACAC 
GCCACACGTC CCACGTCCAA AGATACAGGC AC C AAAG ACT CTGATTCCAC ATCCGGACGC 
AATCGTCGCG GACGGACTAC CCGCCTTTCT TAAGACGTCC CTACTG C AG C AAGAGGCCAA 
AC77CTGGCG CTACAGCGGG CGGACTTCGA GTCGCTCGAG AGCGACATGC GCGCCGCAGA 
GGCCCAGAGA AAAGCATCGC GCGAGGAAAC CCAGCGCAAA ATGGCACACG CCATCACTCA 
GCTCTTACAG CAGGCACCCA GTGCGATCTC GGGGCGCCCG CTATCCTTAC AGGACCCGGT 
GGGCTTCCT2 GAGGG CATC A TATACGACAA GGTCCTGGAG CGCGAATCCT ACGAGACGGG 
TCTCGAGGGA CTGTCCTGGC TCGAGCAGAC CATCAAGTCC ATCACCGTAT ACGCTCCCGT 
AGAGGAGAAG CAAAGAATGC ACGTGCTGCT GGACGAGGTG AAAAAGCAGC GAGCAAACAC 
TGAGACCGCT CTCGAGCTAG AGGCCGCGGC TACGCACGGC GACGACGCTA GACTCCTGCA 
GCGAGCGGTC GATGAGCTGT CACCGTTGCG CGTTAAGGGG GGGAAGGCCG CGGTGGAATC 
CTGGCGGCAG AAAATCCAAA CCCTGAAATC CCTGGTACAG GAAGCGGAGC AGGCCGGCCT 
CCTGTTGGCC ACCATAGACA CGGTGGCCGG CCAGGCCCAG GAGACCATAT CACCATCCAC 3480
ACTCCAGGGA CTGTACCAAC AGGGACAGGA GGCCATGGCG GCCATTAAGC GGT7TAGGGA 3540

CTCGCCCCAG CTAGCTGGCC TG2AGGAAAA GCTGGCCGAG C7ACAGCAGT ACGTCAAGTA 
CAAGAAGCAG TATCTGGAAC ACTTTGAGGC CACCCAAAGC GTAGTGTTTA CAGCC77TCC 
GCTCACACAG GAG G TT AC G A TCCCAGCCCT GCATTACGCG GGACCTTTCG ACAACTTGGA 
GCGGCTCTCA CGATACCTAC ACATCGGCCA GACGCAGCCG GCTCCGGGAC AGTGGCTCCT 
GACACTTCCC ACATTCGACC CCACGCGCCC GGCCTGCGTC CCAGCCGGCG GCCACGAACC 
CCCGTTGCAC AGACAGGTGG TGTTCTCCAG CTTTTTGGAG GCCCAGATCC GATTAGCGT7 
GTCCGTAGCG GGCCCCGTGC CTGGACGGGG TCTGCCCGGA A2ACCGCAGA TCCGAAGGGG 
CGTGGAGGCT GCCGCTTGTT TCCTCCACCA GTGGGACGAG A7ATCTCGCC 72C77CCAGA 
GGTACTGGAC ACCTTTTTCC ACAACGCGCC CCTTCCCGCA GAGTCTTCCT C2AATGC7TT 4080
CCTGGCCATG TGCGTATTGA CGCACCTTGT CTACCTAGCT GGGCGCGCCG TC77GGGCCC 4140 
ACGGGAGCCG GAGCACGCCG CCCCGGACGC GTACCCAAGG GAGGTGGCGC 7GGCCCCGCG
GCC7TCGCAC GCGGAGGCGG CGCACGCATG TCT7GTCACG CTGCCAACAA TGCTCAAGGC
TGTGCCGTAC CTCACGCTGG AAGCCTCAGC TGGACCACTG CCGGCGGACA TGCGCCACTT 
CGCCACGCCA GAAGCGCGTC TG7TTTTCCC CGCGCGATGG CACCACGTCA ACGTGCAGGA 
GAAACTGTGG CTGCGTAATG ATTTTATGTC GCTGTGTCAC CGTTCCCCGG GGCGCGCGCG 
CATAGCC3TC TTGGTG7GGG CCGTCACTTG CCTAGATCCT GAGGTAATAA GGCAGCTGTC- 
GTCCACCTTG CGGCCCCTTA CTGCGGATGA ATCCGACACG GCTTCTGGAC TGCTGCGGGT 
G CT AGT AG AA ATGGAGTTTG GTCCGCCGCC CAAGACGCCG CGGCGGGAGG CGGTGGCGCC 
CGGCGCAACA CTGCCACCGT ACCCCTACGG CCTTGCCACC GGCGAGCGCC TGGTCGGCCA 
GGCGCAGGAA CGCTCTGGCG GCGCTGGCAA GATGCCGGTG TCCGGGTTTG AG AT AG TTTT 
AGGCGCACTG CTGT7CCG0G CCCCCCTACG CATTTT C AG C ACCGCATCAA C C C AC AG GAT 
CTCAGATTTC GAGGGCGGTT TCCAGATACT GACTCCTCTC CTGGACTGTT GCCCAGATCG 
CGAGCCATTC GCCTCCCTGG CCGCCGCACC ACGAAGGACG GTGCCACTGG GAGACCCGTG 
CGCCAACATT CACACCCCCG AAGAGATACA GATCTTTGCG CGTCAAGGCG CCTGGCTTCA 
ATATACCTT3 GCAAATTACC AGATCCCCAG CACCGACAAC CCGATACCGA TCGTTGTGCT 
AAACGCTAAC AATAACCTTG AAAAC AG CTA CATCGCTCGC GATCGCAAAC- CGGACCCGCT 
ACGACCATTC TATGTAGTCC CTCTGAAGCC GCAGGGTAGA TGGCCTGAAA TAATGACCAC 
AGCAACAACC CCCTGCCGCC TACCGACATC GCCAGAAGAG GCGGGATCAC AGTTCGCCAG 
ACTCCTTCAG AGCCAGGTGA GCGCCACATG GTCTGACATC TTCTCCAGGG TTCCCGAGCG 5340

CCTCGCTC32 AATG3G3CTC AGAAGAGTTC CCAGACAATG TCAGAAATCC ACGAGGTCGC 5400
CGCCACGZCG CCACTCACAA TCACCCCAAA TAAACCGACC GGAACCCCTC A3GTCTCCCC 5460 
GGAGGCTGAT CCAATAACAG AACGCAAACG CG3ACAGCAG CCGAAGATTG TCGCGGACAA 5520 
CATGCCTA3T CGTATTGTCC CGTCGC7ACC GA3CCCGAAA CCCAGAGAGC 3TAGAATCAC 5580
GCTACCCCAC GCA373CC3G TTATAT3ACC CC2AGCACA7 CGCCCGTCGC CTATA3CGCA 5640 
TCTGCCAGCA CCGCAGGTAA CGGAGCCCAA AGGGGTTCTC CAAAGCAAAC GTGGAACTCT 5700
CGTGCTGCGG CCCGCCGCGG TCATTGACCC ACGGAAGCCC GTCTCGGCAC CGATCACGCG 5760 
ATATGAGAGG ACGGCGCTCC AGCC3CCCCG GACTGAGGGC GAAGGCCGGC GCCCTCCCGA 5820 
CACGCAACCC GTCACTTTAA CCTTTCGTCT CCCACCTACC G3ACCCA3T3 CCGCAACTGC 5880
AGCCCTAGAA ACCAAAACAA CTCC3CCATC CACGCCCCCA CACGCCATAG AGATTAGCCC 
ACCACAGACA CCTCCCATGT CCACCTCACC TCACGCGAGA GACACAAGCC 3CCCCGCAGA 
AAAGCGGGCC GCACCC3TCA TTCGAGTAAT GGCGCCCACG CAACCGTCGG GAGAGGCAAG 
AGTCAAGCGA GTGGAGATCG AACAGGGCCT TTCCACACGC AATGAAGCC3 CTC3CCTTGA 
ACGCTCGAAT CACGCCGTGC CCGCCGTTAC CCCAAGGCGC ACCGTAGCCC GCGAAATCAG 
GATCCCGC3G GAGATAAAGG CGGGTTGGGA CACTGCACCG 3ACATTCCT3 73C3CCAGAG 

:GGAG TCATCCCCAC CGACTTCCCC CCAGCCTATC CGCGTGGATG ATAAATCGCC
TCTTCCCAAC CTCGTAGAGA GATACGCGCG GGGT7TCCTG GACACGCCCT C7G7AGAGG7
GATGTCCCTG GAAAATCAGG ACATCGCCGT GGACCCCGGA CTGCTAACCC GCCGGATTCC 
ATCCGTGGTG CCCATGCCCC ATCCAATTAT GTGGTCACCC ATAGTACCCA TGAGTTTACA 
AAACACAGAC AT AG A C ACTG CAAAGATAAC ACTGATTAGT T7TATTAGAC GCATCAAACA 
AAAAGTGGCC GCCCTATCGG CGTCCCTGGC GGAGACGGTT GACAGAATAA AGAAGTGGTA 
CTTGTGACTC CACGGTTGTC CAATCGTTGC CTATTTCTTT TTGCCAGAGG GGGGTTTCC7 
CGCGTCGGCC ACCGCGGGGG CGGCCGTTTC CGTCGTGGAT GAGAGGGTTG TGAGAATGTC 
TGACGCCGGC GACAA7GAAT GGGGACCAGA GGACAGGGTG GTTATACTGC TTCCCGAGAC 
CCCCAGTGAG TCCTGGCCCC CGGGCGTGGT GCCGGATGCA GGGCCTGGCC TCGAAGGCAC 
GGTGAACGTC CCCGC3TCGT AAGCCGACGC CGCGGAAACT CGGTCAGCGC GCTCGCGCGG 
TTTCTG AT CC CTAAGGGTCT GCAGATGATC CCGCCTTTGA ATTCCACCCA TCCTCCTCAG 
ATAGGCCTCA TAATAATGAT GGG C AATTAA GAACACGAGA TAGTGTCTCT T7TGCACGAG 
GTATTCGGCC TGCGAGATAT TTCCCTGATC CAGGGTATTC ATGCGAGCCA CCAGGGGATG 
GTGAGCGTAG TCATGATCCA GTCGCTCCTG GATCACGGGG TCTCTCACCT TAAAG - * GGA 
CATCTTCCAC ACAGGCGGGC GAAATAGCCT CAGGAGGAAC ACTTCCCGCA ACAGAACTCC 
AGCAGCTGTG AGGTGAGCTG AAGCAGTCCG CGCACGTCAC GGTG CTTTAA TAGGGCAGCC 
7CGCAGTCGG GCGTCCCAAG GCAAGGCACT ACAAAACTGA CAGTTTGATC TA3GTCTCGA 
ATGGCAAGGG CCGCGTTGTT AGCTAGAACA G-CCTGATTA CGACGCGTGC TAGGGTCCCG 
CGTCCGGTAA TATCGCACAG GGGATACACC CTCATATGTT CGCTGCCACA GTAAGAA7AG 
TAGATCCTCC CCGTGGTCGC ACAGATGGTG AACTGCTTCT C7TTCCTGTC CC7GC7GAAA 
AACACGTTGG TGGGAGGAAA ATT G AC AG T A TGAAACTTGC CCCTGCCAAA GTTAAGACAG 
TGTCCACACT CCATGCACAC AACCGCCCGA GCGCAACGCG CCCGCTTGGC AAGGGCCGCG 
CGGGCCACGC GAGAACAGAT GACGGGTATG GACACGCAGG GGGAGAGAAC A7TGTATGCC 
AGAAGCCTCC TGCCAAGGTT CCGCACGAGA CCAGGTCCCT CCTGCTCGCA GGCGGGCAGC 
ACTACGTGGC GGGACTTAAT AAGGCTCAAA AAACACAGTG ACCCAAGCAT GGCGTCGAAC 
GGGTTACCGC AGGGAACCGT AGGGGCGACG CGCTCCAAGG CCTCCCGGAG GCCGGTACCT 
GCCGCCCC7A TCCCGAGCCC GTTACCGTCT TCGGTCGCAG C2ACACCGCG ACGGGTGTGC 
GAGGGCACCT CCAGGAGGGG ACGACGCGGC AACGGCCCAT GCCACTTC7T CCT7ASCCAG 
GGTAGCGACG GTGGGGGCTT CGAACAGCAG GTCAC7AACG GAAAGCGAGA GCAAAGCGCC 
AACAGCT7GC AGAG7TGGGC ACAGGCCTTG GAAAATGGAA GCGACAGGTA 7777GCCCA7 
ACGTGGCGCG GTATCGCCCT AG CA7GGTCG GCGGCCTGGG CACGGGACAG CGT=ACCACA 
ACCCATACGT GGGCGCCAAG CAGC7GC7GC G C CG CAC AAA 7C7GCGCC7G T7TGGCGACG 
G7GTCTGAGC CAGCGCGCAA CACGGCGATC GCCTGCGCCA G7GACGGGCG 
TGCC7GGC
ACATATTCCT CCACTGGCAC GCCATTCTCG CCCTCGAACA CGCGGTGGGC CGTCAGCTGC
GCCTCATCCA AACCAAACCA AGACACAAGA AAGCGATCCC AGCGCTGATC CAGGGCCA7G 
ACCTTCTCAC CAGCGCGACC GCACGGCCTA AGCTCCACTG AAAGGCGCCC AGAATCCGCA 
CCGTCCTACC CCCCTGGCCC GCCCAATATA CCGCTGTGAC GTCTGATGTA CAGGCCCGCG 
QGTCGCGGCC GTTGGTGGGA AAACCGGCAC CACCCTGTGC GGCCGAATCC GCCACGGGGG 
CTGCCAGACA GTACACTGTC TCCAGCAGCG ACTTCAGTCT CTTGTGACTT TTGGGCGTCA 
CCACCAAAAA TTGCAAAACC TGCCTGTAGT CCGTGAAGTA GGTACGGCAT ATT A C CAT G G 
AGTTGTACAC GCCCAGGTTC TTTGAGAACA CCAGGCTCGC CTTGAACTTT G TAAAG T CAT 
CCTGCCCCAG CACGACAGAC GTATTTTTGG CAAGGTATAC GTCCGACTCC ACGGGAAGGA 
CGTGCCCAAA CTGGGACACG GCGTCGCTTG GTCGGCACAG AAAGCACTTC AGGGTTGTGG 
AAAGG C C ATT ATTCGATATA ACAAAGCAGG GAGAGAACGG GTAGTGCATC TCCTCCAGGA 
GGTGCGCCCA AAACTTATAC ACAAACTCTA AGTGGTACAC GCAACCGTGC TGCATTCTAA 
CCGTACATAT GGCGGTAGCA CCGCCCTTAG CATAAACTGG GGCCCCGTCG ATGCACC3TT 
CCAAATCCAG GGACTGACCA GACTGTCCCA AGTATGAGGA TACCACCCGA CACAGTTCGT 
CCACTACACG CTTAC CAACG ACACTCATGG CGACAGCGGG GTGGGGCTGG CAAGGCCCCC 
AAAGCGCGAC ACCCGCAGTC AATCAGGGCC GTGCCCGCGC CTCGGAGAAT ACGGCGTCCG 
TGCTCACGAT CT7GCGCAGG ACCTGCCTTA CCGTGTCCAC CTTGCTCTCC AA C A C C AG AG 
TAT GAT C G C A GGCTGCAGGC TGTGCCCGCT GGACGAGAAA GGTTTTTAAA TACTGACAGT 
AGTTGATGGC GTTCAATCTA CAATAGATCG TGGGAAATAA AATTTGCATG TCACGAGGCA 
GAAGCTGGTC AGACGCGTAC TCCATGTTGG GTTCCACGGG GAGGGGAACA CACGCCCCAA 
GACACGACGG CGCACATAGG GAGCGGAGCA AACAATTGAT TCAAATATTT GACTCCGCAG 
CGAGCCGGTT TGCAGAGTGG - TCACCTGCCC TGCTCCACAC CCACCCCCGC GTCTCTTCCA 
ACTCTCAACT CACGATCCAG GGAAACCACC GTCCAGTGGC CATGTTTGTT CCCTGGCAAC 
TCGGTACAAT TACCCGTCAC CGAGATGAGC TCCAAAAACT ACTGGCAGCC TCCCTGCTCC 
CGGAGCACCC GGAGGAGAGC CTCGGTAACC CCATAATGAC ACAGATTCAC CAGTCGCTCC 
AAC CAT CTT C CCCCTGCAGG GTCTGTCAGC TCCTATTTTC TCTGGTCCGC GATTCGTCCA 
CCCCCATGGG TTTCTTCGAG GACTATGCCT GCCTCTGCTT CTTCTGTCTA TACGCCCCAC 
ACTGCTGGAC CTCGACCATG GC3GCAGCGG CAGACCTGTG CGAGATCATG CAT CTG CACT 
TTCCAGAAGA GGAGGCGACA TACGGGCTAT TCGGACCGGG TCGCCTTATG GGTATCGACT 
TGCAGCTGCA CTTCTTTGTT CAAAAGTGCT TTAAGACCAC CGCCGCCGAA AAAATACTGG 
GAATATCCAA CCTGCAATTT TTAAAATCAG AATTCATCCG GGGCATGCTC ACAGGCACCA 
TCACCTGCAA CTTCTGCTTC AAAACGTCCT GGCCCAGGAC AGACAAGGAG GAGGCCACCG 
GCCCCACCCC ATGCTGCCAG ATT AC AG A C A CCACCACCGC ACCCGCGAGC GGCATACCGC- 
AACTAGCCCG GGCCACATTC TGCGGCGCAA GTCGCCCCAC AAAGCCCAGC CTACTTCCCG 10380
CGCTAATAGA TATCTGG7CC ACGAGCTCAG AGCTCCTTGA CGAGCCGCGC CCTCGACT3A
TCGCAAGCGA CATGAGTGAA CTCAAATCCG TGGTCGCATC CCACGATCCG TTCTTCTC7C 
CCCCGCT7CA GGC AG ACACC T CACAGGGTC CATGTCTGAT GCACCCAACC CTGGGGCTAC 
GATACAAAAA CGGGACTGCA TCCGTCTGCC 7CCTCTGCGA GTGCCTTGCG GCACACCCAG 
AGGCACCCAA GGCGCTGCAG ACCCTTCAGT GCGAGGTAAT GGGCCATATA GAAAACAACG 
TAAAGCTGGT AGACAGAATT GCCTTTGTGT TGGACAACCC ATTCGCCATG CCATATGTAT 
CAGATC23CT A CTT AG AG AG CTGATCCGGG GCTGTACCCC AC AG G AAATT CACAAGCACC 
TGTTCTGCGA CCCGCTGTGC GCCCTCAATG CTAAGGTGGT GTCAGAGGAC GTACTAT7CC 
GCCTGCCCAG GGAGCAGGAG T AT AAAAAG C TCAGGGCATC CGCGGCCGCC GGACAGCTCC 
TCGATGCCAA CACCCTGTTC GACTGCGAGG TCGTGCAGAC TTTGGTCTTT CTCTTTAAGG 
GTCTCCAAAA CGCCAGGGTG GGG AAAAC C A CCTCACTAGA CATTATTCGG GAG CTAAC CG 
CACAACTAAA AAGACACCGC CTAGACCTGG CCCACCCCTC ACAGACGTCA CACTTGTACG 
CTTGAGCTGG TCCCGGGCCT TCGCACCCCA TCCACCGATG CCGAAATCAG TGTCCAGCCA 
CATCAGCTTG GCGACCTCAA CCGGTCGCAG TGGACCGCGA GACATCAGAA GATGCTTGTC 
ATCCCGCCTG CGGTCGGTCC CGCCCGGGGC GCGAAGCGCC AGCGTCAGCA GCAAGCACAG 
AAACGGCCTT CGCAAGTTTA TCTCAGACAA GG T ATTTTTT AGCATCCTAT C G C A C AG AC A 
CGAG CTAGGA GTGGACTTTC TCCGTGAGAT GGAGACCCCG A7ATGCACCT CCAAAACAGT 
AATGCTGCCC CTAGACCTGT CTACCGTCGC ACCCGGCCGC TGCGTCTCCC 7CTCTCCGTT 11460
TGGACACTCC T2AAACATGG GGTTCCAGTG CGCTCTGTGC CCATCCACAG AAAATCCCAC 11520 
CGTTGC2CAA GGCTCCCGGC CTCAGACAAT GGTGGGCGAT GCGCTCAAAA AAAATAACGA 
GCTATGCTCG GTAGCGCTGG CCTTTTATCA CCACGCAGAC AAAGTGATCC AACACAAGAC 
GTTTTACCTA TCACTCCTCA GTCACTCCAT GGATGTGGTT CGG CAGAGCT TCCTGCAGCC 
TGGTCTACTG TACGCTAACC TGGTCCTAAA AACCTTTGGG CACGATCCCC TACCCATG7T 
CACTACCAAC AACGGCATGC TAACAATGTG CATCC7TTTT AAAACCCGGG CACTACATCT 
GGGAGAAACT G CG CTT AG G C TGCTTATGGA TAACC7CCCC AACTACAAGA TATCGGCGGA 
CTGCTGCAGA C AG T C CT AC G TGGTCAAGTT TGTCCCAACG CACCCGGACA CCGCAAGGAT 
TGCAGTGCAG GTACACACCA TATGCGAAGC GGTTGCGGCG CTAGACTGCA CCGACGAGAT 
GCGGGATGAC ATTCAAAAGG GAACCGCACT TGTCAACGCC CTATAACCTC ACATGTAGCC 12060
TGTCACCCCA GCTCCTATTG CAACTGACCA TGTTCAGGTG GTAATAAAGT CATTAAACGA 12120
CAAAGTGATT CTTTTAATCT GTTTATTGTT TTTGAACATG TGGCACACGC TGCAATGTAC 
TGCCATGAAA GGTGGTTCTA TATCCACCAC TTGG2GTCTG CCGAAGTCAG T3CCACAATT 
TCATTAACAA ACAAGGTCAA TACATTGTGA GGGAGTGTTT TTTGCCATGG TACCATT2GT 
GTGGTTTGGG AGAGCGGACG CCATTTGCGT GCAAAATGTG CTTTGCTGGA GGCCAACTTC 
CGTCGCGCTG GTTGATGCGC GGCACATTGT GTCAACCAGG G2ACCCTCC2 CCACCGAGTG
CTTTAATGCG GAGAGGAATG GTGGCCTGGT TGACACCGCG TGCCGGCCAT CTGAACTG7G
ACTGTGTTAT GAGCCACGGG TATGCCCTCG ATACGCCTGC T CTTCAG CAT TGTATGTGTT 
TAATGTTGTG CTTGGTGCAA CCGTGATTGT GTTTTTGTAT T7TATTTTAC TGACACT CT7 
TGGGAGGGCA CGCTAGCTTC AGTGCGCGCC CGTTGCAACT CG7GTCCTGA ATGCTACGG-3 
GCCACGCTGG CCACTCGGGG GGACAACACT AATCGCCAAC AGACAAACGA GTGGTGGTA7 
CGCCCCAAGC CTCCAGCGCC ACCCATTTAG TAACACATCC GGGACATGAA CTGCCACAAA 
CACCGTTAAG CCTCTATCCA TGCATTGGGA TTGGAGTGAG GAGGGAGGAG GGCACCAGGT 
TCCCGG3GAG GAGGGCACCA GGTTCCCGGG GAGGAGGGCA CCAGGTTCCC GGGGAGGAGC
GCACCAGGTT CCCGGGGAGG AGGGCACCAG GTTCCCGGGG AGGAGGGCAC CAGGTTCCCG 12960
GGGAGGAGGG CACCAGGTTC CCGGGGAGGA GGGCACCAGG TTCCCGGGGA GGAGGGCACC 
AGGTTCCCGG GGAGGAGGGC ACCAGGTTCC CGGGGAGGAG GGCACCAGGT TCCCGGGGAG 
GAGGGCACCA GGTTCCCGGG GAGGAGGGCA CCAGGTTCCC GGGGAGGAGG CTGGGGTGCG 
CCGCGCCGGG TTCCTGGGGT GCGCCGCGCC GGGTTCCTGG GGTGCGCCGC GCCGGGTTCC 
TGGGGTGCGC CGCGCCGGGT TCCTGGGGTG CGCCGCGCCG GGTTCCTGGG GTGCGCCGCG 
CCGGGTTCCT GGGGTGCGCC GCGCCGGGTT CCTGGGGTGC GGGGTGCGGG GGACCGCGCC 13320 
GGGGTACTGC AGGGTTCGCA GGGTTCGGGG GTACTACCTG GTTTCCTGGG GTGTGCCAGG 
ACGGGTTCCT GGGGTGCCAC CGCTCCTCGA TACGTGTAAA TCCAAGAGAT CCGTCCTCCG 
TGCCGCCGCG CGCGTAATGC GCGAGGGGGG TCGGTCTCCC CTCTTCTTTA TAGCGTTTCC 
TGCGAAGGGG GCGTAACCGT AGGACAAACT GCTTATGTAG GGGTTAGCCA CCCATTTCCC 
GGGGCCGCGC CAGAGGTGAG CGTGGACCTA GCATCCCGCT CCCATTTACC GAAACCACCC 
AGAGGCGAGA TTCCAGGGCC GTGACTCACT AGCTCCCCTC CCATCGAACA ACCACGCTTG 
GCTAACACGG CTGGAGTGGC GGTGGGCGGG GCCCCTATAA TCCTGGCCCC CATCTACTGA 
AACGACCCAG TAGAAAAATC CCAACCCCAT G ACT CAT CAG GCCCTATTAT ATAGAATATC 
C CAG TAG AG T GACCCAGCTG GTTTCCATAA ATGGATATAC TTCCGGAAAA CGAAGGAGGG 
TTGAATACAG TTGGGGGTAG TCCGCTGGTA TTCCCAGCTG AGGTTGCCTT ATTTGGTAAT 
GCTTCCGGAA ATACCACCTG AG T AC C C CAT TGGTTTATAC CTTGTTTAAT T G T AG AATT A 
CAGCTGGATT TACCCAGCCG GGTTTACGCA GCTGCGTATA CCCAGCTGTG TTTACGCAGC 
GGGGTTTACG CAGCTGGGTA GACCCAGCTG GGTATACCTA CTGGAATAGG GGCTGCGATG 
ACTCAGCTGC G CT AG G ATT A AAGGATTATA TATATATATA TAGGAAAAAT CAAAACAAAA 
CTCTAATCGC TGATTGGTTC CCGCTCTGGG CCAATCAGCT TGGGAGTTCT AGGGATAGGG 14220 
GCCAATGGGA GGCCTCCGAA TTTGATTGAC GGCTGGGGCG TCCAATGGAA TGGCGCGGTC 
GCCTAGCTCG AACGGGATTG GTCGGCCGGA TGGGCCAATG GCGGCTCGGA AAA CTTT G AT 
TGACGGGCCG GCGGACCAAT GGGAGCGGGG CAGAGGATTA TGGGGGATTA GC
CTAGCGGCGT GACCTGGAGG TGACCCCGTG CACCCGGGCG CTCTGAATT:

GCGCGACTCC TCATCTACAT AATTTATGCA CATAAAAGGA TTAGCGCATG CAAATTAGTC

AGATAGCAGG GCCATCCACA CTTTATGTTG GCCGCGTGCC AGGCGCCGGC GTGGGCGCCG 
CGCGCGTGCT CTCTCAGTCG CGCCTAGCTG CTTCCAACAG ACAAAAGCGG GGCGTTAGTG 
AGGGAGTGCG CGCGCTGCGC TGACTTGGCC GATTTCCAGT GCATGCTTTG TCACCCCAGC 
GCGAGAATGG AATTTTCATT ATTGAGCAAT TTGGGCACCC TGGGCACGAT AACCATACAT 
GGATACACGG GTTCCAAATA TGCAAAGTAG ACACTAAGGT AC C ATT TG G C ATATTTGGAC 
GTCCTGGGCA GGTTAGCTAC CCACCAGAAT ATATGGGACT CTGGGCAGGA TAGCCACCCA 
CAATTGTTTT GCGCCCCTCT TTGGCCAGGG GACCAAGGTC G7ATGGTTCG CGCTACACTA 
AGCCCGAACG TTCAGCTTTG CGTGCTTTCG ACGTCCAGGC GGCTGGCACA CGGGCCGTGA 
GCGCCAGCAA CATGGGATCA TGGTAGTAAG AT ACAG CAT A AATCCCCGTC CGGTGGCGCT 
CAACGCCAAT ATGCGCGGCT GCGTGGTATC TCATCGGTGG GCACGCGTAC GGTGGTCTCA 
TGGGTATTGG ACTTGTAGGC GAGGGGAGGC GCATACGACA AAAATTGCCG CCGTGAAGGT 
CGGGAACCCG CCCGCGCTTC CGCAAGGCAC GGGGCCGCAT CGGACACAGG CTAAG C ATT A 
AG GAT CAT AA CACCGCCCTA GAAATGTTTA AGCTGTGACC AAAGCGAACC TCGCATGAGG 
CATACGCGAG CGTGGAGGTA GGATTCCCAA GGCTATTGAG AGACGGTGGG TGAAATGATG 
AAGAACACAC AGAACAATAA CGGGCGACTA GATAAAAAGA CTCGCTCAAC AGCCCGAAAA 
CCATCAGCCC GACCGCCGAT GGATTAGGTG CTGCTGGACA AGTCTTTCTA AACCCGCGCA 
GGGTTTGTGT CGATCCAGAC GCTTACGAAC GCC2GCTTTA AAAACACTAT T C ATAATT AA 
CAGAAGTTGA CACCAGCCCG CAGTTACCCA ACCTTCTATT TTTTTGGAGT GTTGACAAGT 
TTCCATCGCC CGTTTGGCGT TTCCCGCATG GTGTCAAATT AGTGACGCAC CCTCCCCCCG 15720 
TCACTATGGG TTTACCCTGA TTTAGTAAGT AAAACTGCCG CCCCCGCCCA CTCATTTTTT 15780
TACCCTGTTA TTTGCTGTAT TTACATCTAC GGACCCCCTT TTGGTGAGAT TGCCGTGGTT 15840
CTAAATAACG TTGTGGTTTT CGGACCCTTT CAGGGACCAA ATCTTTTACG TGTTGCCAAG 
GTAG CATTTG CTGGACCCGC AT AG G TTTTT GTGGCACCAG GTTATGGTCT TATGAGCGGG 
CTTGACCGGC AAGTTCCAGG CATCCTAAGT GCTTGATGTA GACCCTTAGG GCACCAGGGA 
CTACCTAGGT CAAACTCCCC CTTAGTCATG ACGCCGTGCC CACGAGGTTT GAGAGGCGTA 16080
GACATCCGTG TCGACTGCTG GACGGAGGTA GTATAATCAG CTAGGCCTCA GTATTCTATG 16140
TAACAAATGA ATGCCCTAGA GTACTGCGGT TTAGCTAGTT ATACTGCCCG GTTCCACCAG 
GCGGCGTTGT GGCCACGGGC GGTTCGTCGC TTGGACCTGG AGGGGTGTCA CATTCTGTGA 
CCGCGACGTT GACGTTAGAC ACACGTCGCT GCCGTCCTCA GAATGTGATA GCCCATCACA 16320 
GGCATTGTAG CTGTTGCGTT GGTTGGGAGT TTGGGGACCA AATTTCTATA ATTG3TGTCA 
CCGCGGCAGC TCTAGCCCTG GAAGATCTGG AAGCTTGCTT CAATGGCTCA GATCGACCCG 
GACTACAGTT AGCGAAGTAG A 2 C 2 ATT AT A ATCTTAATCT TAAATCTG3T T3ACGGA2TT 







TCGCGCCGGG AACACGCAGG TGGCAGCGGA TGTGTTTTGC CCAAACACGA GGGT7GCAGG 16560
AAACAGGTGC TGCCGGGGAT TATGTACAGC TTACACCCAG TTTCCTGTAA TGGCCCGCA7 16620
CCGGCCGTCC TGGGCAGCAC CGCACCCTGC GTAAACAACC GC3TACTTT7 TCCTCCTZZ2 16680
CCCACCCCCA CATCCTTCCT CCCACCCTGC CAGTCCAACC CGCTTCCTGT TTTATTCGCZ 16740
TTCAAACAGA AGCACGCATT CTAATGATTC TTACAAAACT 7GTTAG7GT7 TATTAAATCA 16800
GATACATACA TTCTACGGAC- CAAAAATTAG CAACAG CTTG TTATCTATGG TGTATGGCGA 
TAGTGTTGGG AGTGTGATGG GCCGGAAAGG TGAAGGCCCA TTAGGGTTTG CACTTGGCGC 
TGTAGG7C7A CTCTTGACAA AGATCTAAGC ATTGACATTA GGGCATCCAC GTCAGTGGGA 16980
CCCAGTAGGT CTAAGTTTTC CATACAGTAC ACCCAGTGTA AGATGTCTGT GGTGTGC7GC 17040
GAGACCCTAT AGTGTCCTTG CTTAAAAATA TCAAAGACCT AATATCCCTC GCACACAGC7 
CCCCGTCTAC GTGGAGAACA GTGAGCTGAT AAGGG CTGAA ATAACTCATT G7GCCCGGTA 
GGTGGCGCTC TAAAAAACGC GGGTCTAAGT GAAGCAGGTC GCGCAAGAGG TCTCTGCGAC 
CTGCACGAAA CAGACATTCC G CTAACAGG G GAAACGTTAA CCTGCCCTCC TCCTT7AAAG 
CTCTAAGAGC TCCAATTAAT TGGGCCAGTG TGGGTTGAGG TATGAACACG TTTAGGAGGA 
ACAATACCAC TTCCCTGTCA TCCGTGCCCA GTTTCCGCGC CACCTCACAG AGAACCTCGT 
AAGTGGCCA7 GGTGCCGGCT TGTATATGTG AAGGCACCGA TGTGGAAAAA CAAAGGAAAA 
TTTAT7777C CGCCCTAAAC AAAATCACAA GCTTAATAGC TGTCCAGAAT GCGCA3ATCA 
AAG7CCGAAA CAGATGTTAG GATCTGTTCC ACTGCCGCCT G7AGAACGGA AACA7CGCA7 17580 
CCCAA7A7GC 77GCCAGC7G AGGAACTAC Z CCACCCGAGT GGG7ATCC7G CGGAATGACC- 
T7GGCAGGAA CCAACAGCGC ACAGCC7GCA GCGC7GA7AA TAGAGGCGG3 CAATGAGGCA 
G7C777GGG7 CAACTAAGGC 7777G7AATC AGGG7G7TGA CC7CG7GG7G C C AAAAGT Z Z 
AGGTG7TGGG AGCCCCCCAG CAA777AAG7 AACAAGAAGG AAG7GACG7C CGTCG CTAAG 
ACTGCG7C7G T7CGCCACGC CAAC77C7CA AGGAG77C77 7GTCC7GG7C 7A7AAG77C7 17880
7GGCGGGAAA AGGAG7C7GC CGCGGCA7AG CAAAG7GAAC 7GG7AGAAA7 AGGCG7GAGG 17940
CTTC7GAGG7 7AC7GGCCAC 7AACAGGCAG GCGC7CCCTG 7C7777GAAA G7G7TCT77G 
G ACAC CTG 27 77A7AAG7AG GAG7C7GTCC AAAAGA7TAA GGGCCAACGC GACCACG77A 
GGT7C7AGG7 7GTA772C7G GCAAAC7GAA AACA7CCA7G 7G Z Z 2AG7AA C77ACG CAT A 
TGCGAAG7AA GAGA77G7TG AAAGG7CCCA AA7AGAGAG7 2AGAAG77AA AAAGCGCGGC 
7CAATT7CAA GAA7A77G7A AAAGA7CCGA 7CCTCACATA GCGTGGGA72 CAGAAG7CC2 
GAGGGCGGG7 7A7TGGGAG7 TGCCATATAG AGTGGCGAGC G7A7G7GG2C 7ACC7G7AGA 
GCCTGGAG77 7CAGGG7GC7 C7G7CAGG7T C7CCCA7CGA CGACGC7GGC- CCGCGAGAG7 
ACG2TAG2CG 77G722G7G7 G77CAG77GA GG7AGATGGG 7ZG7GAGAAC AC7GCCCCCG 
ACACACACCA GCACGCA7GG CGCCAAA7GC AAGTGCGGAG 2GGCGACGG7 GGGTTC7AGG 
GAGGAAAAAG GGGGAGAGG7 GTGGC7TT7A 7G7~AT77CC 7G73GAGAG7 ZZZ:
TTGGTTTTCC CCTGGCTGGG TTAATGGCAG GGGCTT7TTA AACT7AACTA TGGAAGA7TG
TAGGTTTCCT GCCAGGGGGT GACTAGCTTC CCAGGCTAGG CGGGCCAT7T GTACTTTCT7 
ACTTGTGTCT TTGTTCTGAC AATACACATA TACACAATAA GTTATGGGCG ACTGGTCTGG 
TCCAGGGTGG GGCAAGCAGG ACACGGGGCC TGCCTTTACT CCTCCAAACT GGAAGGCCTG 
AGATAATTTT TTAAGTCCGT ATGGGTCATT GCCCCAAAAA ATCACTGCAA ACTTCCATTG 
ACACTTTGGA TCTCGTCTTC CATCCTTTCC CAAAAAGCGT CTATAAAAGA TGTGTTGTGG 
CCTAGCTTTC GCAGGACAAT CATCTATCTG TCTGTAAGGG ACCGGTGGTT GTTGGTATCT 
TGGATGTGGC TTTTTTGGGT GGGTAACTGG AACGCGCCTC ATACGAACTC GAGGTCTGTG 
GGGTGGTGAT GTTCTGAGTA CAT AG CGGTA TTCGCGAGAT GGGCCAGGTT GTGGGTCATC 
GTCTGGTGTA TTATC7CCTG GTGGGCTACT GGCAATTTGT TCATGTGTGC TAACAACAGG 
GTAATCCAC7 TCCATTTCGT CCTCGGATGA CGACCCGTGC AAGATTATGG GCTCTTCCAC 
CGTCTCCTGC TCCTGCTGTT CCACCCCCTG CTGCTCCTGC TCTTCCACCT CCTCTAACTC 
CTGCTGCTCC TGCTCTTCCA CCTCCTCTAA CTCCTGCTCT TCCTGCTGTT CCACCTCCTC 
TAACTCCTGC TCTTCCTGCT CTTCCACCTC CTCTAACTCC TGCTCCTCCT GCTCCTCCTG 
CTCCTGCTCT TGCTCCTCCA CCTCCTCTAA TTCCTGCTCT TCCTGCTCCT GCTCTTGCTC 
TTCCACCTCC TGCTCTTGCT CTTCCACCTC CTGCTCCTCT AACTCCTGCT CC 
TAACTCCTGC TCCTGCTCCT CTAACTCCTG CTCCTGCTCC TCTAACTCCT GCTC 
CTCTAACTCC TGCTCCTGCT CCTCTAACTC CTGCTCCTGC TCCTCTAACT CCTGCTCCTG 
CTCCTCTAAC TCCTGCTCCT GCTCCTCTAA CTCCTGCTCC TGATCCTCTA ACTCCTGCTC 
CTGCTCCTCT AACTCCTGCT CCTGCTCCTG CTGCTGCTCC TGCTCCTCCT GCTGCTCCTG 
„ rc _ r , CTGC TGCTGCTGCr CATCCTGCTG C7GC7GCTCA TCCTGCTCCT GCTGCTCATC 
CTGCTCCTGC TGCTCATCCT GCTGCTGCTG CTCATCCTGC TGCTGCTCAT CCTGCTGCTC 
CTGCTCATCC TGCTCCTCCT GCTCATCCTG CTGCTCCTGC TCATCCTGCT GCTGCTCATC 19920 
CTGCTCCTGC TCATCCTGCT GCTGCTCATC CTGCTCCTGC TCATCCTGCT GCTGCTCATC 19980 
CTGCTCCTGC TCATCCTGCT GCTGCTCATC CTGCTCCTGC TCATCCTGCT GCTGCTCATC 20040 
CTGCTCCTGC TCATCCTGCT GCTGCTCATC CTGCTCCTGC TCATCCTGCT GCTGCTCATC 20100 
CTGCTCCTGC TCATCCTGCT GCTGCTCATC CTGCTCCTGC TCATCCTGCT GCTGCTCATC 
CTGCTCCTGC TCATCCTGCT GCTGCTCATC CTGCTGCTCC TCATCCTGCT GCTGTGGCTC 
CCGCTGCTCT GGCTCCCGCT GCTGTGGCTC CCGCTGCTGT GGCTCCCGCT GCTGTGGCTC 
CCGCTGCTGT GGCTCCCGCT GCTGTGGCTC CCGCTGCTGG GGCTCCCGCT GCTGTGGCTC 
CCGCTGCTGT GGCTCCTGCT GCTGTGGCTC CTGCTCCTCT GGCTCCTGCT GCTGTGGCTC 
CTCCTGCTCT GGCTCCTGCT GCTGTGGCTC CTGCTCCTCT GGCTCCTGCT GCTGTGGCTC 20460 
CTCCTGCTCT GGCTCCTGCT GTTGTGGCTC CTGCTGTTGT GGCTCCTGCA GGGGCTCCTG 
CTGCTGTGGC TCCTGCTGTT GTGGCTCCTG CAGGGGCTCC TCCTGCTCTG GCTCCTCCTG
CTGTGGC7CC TGCTGTTGTG GCTCC7GCAG GGGCTCCTGC TGCTGTGGCT CC7GCTGC7G
7GGCTCCTGC TGTTGTGGCT CCTGCTGCTG TTGTGAACTT TGGATGCTCA ACGTTTTGT7 
7CCATCGCCC CCGTCCTCCT CGTCCTCCTT C7TGTCCTCC 7CC7CGTCAT CC7CCTCGTC 
CTCATTGTCC TC AT CAT C G T CATCCTCCTC GTCCTCCTCC 7CC7CG7CC7 CCTCCTCGTC 
CTCCTCCTCG TCCTCCTCCT CGTCATCCTC CTCGTCATCC TCCTCGTCAT CCTCCTCGTC 
ATCCTCCTCG TCATCCTCCT CGTCATCCTC CTCGTCATCC TCCTCGTCAT CCTCCTCGTC 
ATCCTCCTCG TCATCCTCCT CGTCATCCTC CTCGTCCTCC TCATCTGTCT CCTGCTCC7C 
CTCATCATCC TTATTGTCAT TGTCATCCTT GTCAACCTGA CTTTCCTTGC TAATCTCGTT 
GTCCCCAT7A TCCTCGCCAG C C TG ATT ATT TTCGGAACAT TCTTTTTCAT 7C7TGGATGC 
TTCTTCTGCA ATCTCCGCAA GGAGCACCAA CATGGCTGTG TCATCACCCC AGGATCCCTC 21180 
AGACGGGGAT GATGATCCTA TGGAGATGGG AGATGTAGGC GGTTGGCGTG GCGGAGTATC 21240 
GCCATCGCTG GATGATCCCA CGTAGATCGG GGACTCTGTG GCCCATGGGG GGTACACAC7 
ACGGTTGGCG AAGTCACATC TAGGGGGAGA GACTGGGGGC G ACTG A CAT A 7TGGGTT7AG 
TGTAGAGGGA CCTTGGGGGG ACGATAGCCT TC77TTTCTC AGGCTACGCA GGGTAGACGG 
AG CTAAAG AG TCTGGTGACG ACTTGGAGGG AGGCTCGGGT GGAGGAGTCG 7GGGTGAG7G 
TGGAGGTGTA GTCTGCTGCG AGGG7GGCGG ACGCATAGGT G7TGAAGAGT CTGGCCTTCC 
7GTAGGAC77 GAAAGCGGTG GCCTT7GAGA AGACTCTGGA GACTGCGTGG GTGGCAATGC 
AGGAGATGGA GAATGAGTAT CCGTGGTCCC CGGAGACACA GGATGGGATG GAGGGA77GG 
GGAGGAAGAC GTGG77ACGG GGGGTAAGAG 7GCCGGTGGA GG7AAAGG7G 77GCGGGAGC 
GGGTGAAGGA ATGGGAGCCA CCGGTAAAGT AGGAC7AGAC ACAAATG CTG GCAGCCCGGA 
TGTGAACAC7 GTGGGACTTC C AG G 7 AT AG G CAAGGTGTGG GG7CCACA77 CCCGGCCGTC 
GATGGAGTCG GCGACATGCT TCCTTCGCGG 77G7AGATGT AGGTCATCGC CAAGGTCACA 
ICGGA GACC7G777C GTT7CC7ACA AC77CCTC7C G77AAGGGCG CGCCGG7GC7 
^C CTCAGGCGCA TTCCCGGGGG CGCCATCCTC GGGAAATC7G G7C7GACAAC 
CAAAGTAAAA TTATGGAGGC GGTGGCAGTA TATTCACAT7 A7GCAA7ACC CG7AGTGACC 
ACAAGGGGGA GCTCTCAGAC AA77AAGCGG 7TACACACAG 7AGCAGGC7G CAG7ACCGCC 
CATGGCCACA GGATGTAGAT CGCAGACACT GAAACGC7GA AACACAG CA7 7 AAG CTG C AA 
TACCGCCGA7 G G C C AC C AG A 7GGCACGCGC C G C C AG C AAA 777AAGTCC7 GG7GGC7CAC 
CTGCCAGGTA AACAAGGT7A AAGTGGGTTT GC7GGCC7TG CG77GCCATG GATGC7ACCT 
AGGCAAGTCC AGATATATAA TCCGGGCGTG AGAAACAGAA ACGGCCAATA ACCCATG7T7 
7TCGAAAACC ACCACACACC TTAACACAAA TCA7GTACAC C7GG7A77AC 7A77TCCCAC 22440 
ACATC77A7A GCATTTCAAA GATAAGGGTG CC77ACGGGC CGCCCGAAAC AAG7GGGCGG 22500
GCGC7AC7CA CTGT77A7AA GTCAGCCGGA CCAAGCTGC7 GCTCTTGGGG ACG7GAC7GC 
T7CG7GGCGC AGCTGCC7CC AAA7GATACA CACA7TTTT7 GATTG7CCCG GSC3CCGCGT
AGTGGAGGGC GGAGTTATAT CAAGCTACTT TCTGATTGGT GCCCCAGGCA GGACTGC2A7
AAAAACTGAA GAAGGCGTGT CTGCTTTGCA GAATTTACCC CCCACTGTGC TCCCGGTTGC 
TGGCACCGGT TCAGTGGTCC GACCTGTCGT CTGTGCTCCC CCGTGGACGA CGCCGAG7G2 
CTCTCGGGGG TCCATGTCTA GCCTCTTCAT TT C ATT AC CT TGGGTGGCG7 TCATCTGGC7 
AGCCCTCCTT GGCGCGGTTG GGGGTGCCCG CGTTCAGGGG CCCATGCGGG GCTCTGCTGC 
CCTCACCTGC GCCATCACGC CCCGTGCTGA CATAGTTAGC GTTACCTGGC AAAAAAGGCA 
GCTCCCCGGT CCCGTAAACG TCGCCACGTA C AG C C ATT CA TATGGGGTGG TGGTTCAGAC 
CCAGTACCGC CACAAGGCAA A7ATAACCTG TCCTGGGCTT TGGAACTCTA CCCTTGT7A7 
CCATAACCT7 GCAGTGGATG ATGAGGGCTG TTACCTGTGT ATCTTTAACT CATTTGGTG3 23160
CCGGCAGGTG TCATGCACAG CCTGCCTGGA AG TGACATCT CCCCCTACTG GACACGTGCA 23220 
GGTAAATAGC ACAGAAGACG CAGACACCGT CACCTGTTTG GCAACTGGTC GCCCACCCCC 
CAATGTCACC TGGGCCGCAC CCTGGAACAA CGCCTCTTCT ACCCAGGAGC AGTTCACTGA 
CAGTGATGG7 CTTACAGTTG CGTGGAGGAC CGTGAGGCTG CCGCGTGGGG ATAATACCAC 
CCCAAGTGAG GGAATATGTC TCATCACCTG GGGAAATGAG AG C AT AT C AA TCCCGGCTTC 
T ATT C AAG G C C2CTTGGCCC ATGACCTTCC CGCGGCCCAG GGAACTCTTG CCGGGGTTGC 
CATTACTCTG GTGG3CCTAT TTGGGATATT CGCATTACAT CATTGCCGCC G2AAG2AGGG 
CGGTGCATCA CCTACTTCAG ATGACATGGA CCCCCTATCC ACCCAGTGAC TAGATGGACA 
CCCCGTGAA2 CGTCGTGCTT ACCCACCCCC TTCTGATTCT GACAGACAAC ACTACTATGT 
CC CAAAGACT GTTTTTTACA GCCCGATGGC CCTTCAGGCC TCCTTGAGTG T2TAGCTGG7 
CCCGTGGTCA TTGTGTGGTT TGGCAGTCAC TTCCCCATTT TGGTGTCGCG TTTTGGGTTT 
TGCCCTGCCC CCAGC2AACG TG GAT CAT AT TCTTTCCCGT CAGGGGAGTC- A 2 AAG CT AT A 
GGACAGAAAG GTCACCTGGC CCAAACGGAG GATCCTAGGT GGGTGTGCAT 77ATTAGACC- 
TTGGTGTGTT GAAGGACGGA TCAGGCGGGG AGGAGGGGGT GGGGGAGACT TACTGCAGCA 
CTAGGTTAGG TTGAAAGCCG GGGTAAAAGG CGTGG CTAAA CAACACCTAT A2TACTTGT7 
ATTGTAGGCC ATGGCGGCCG AGGATTTC CT AACCATCTTC TTAGATGATG ATGAATCCTG 
GAATGAAACT CTAAATATGA GCGGATATGA CTACTCTGGA AACTTCAGCC 7AGAA3TGAG 24180
CGTGTGTGAG ATGACCACCG TGGTGCCTTA CACGTGGAAC GTTGGAATAC T2TCT2T3AT 24240
TTTCCTCATA AATGTTCTTG GAAATGGATT GGTCACCTAC ATTTTTT33A AGCACCGAT7 24300
GCGGGCAGGA GCGATAGATA TACTGCTCCT G3G7ATCTGC CTAAACTCGC TGTGTCTTAG 24360
CATATCTCTA TTGG CAGAAG TGTTGATGTT TTTCTTTCCC AATATCATCT 22ACAGGCTT 
GTGCAGACTT GAAATTTTTT TTTACTATTT ATATGTCTAC TTGG AT AT 2T T2AGTGTTGT 
GTGCGTCAGT CTAGTGAGGT ACCTCCTGGT GGCATATTCT ACGCGTTCCT GGCCCAAGAA 
GCAG72CCTC GGATGGGTAC TGACATCCGC TGCACTGTTA ATT G C ATT G G TGC7G7CGGG 
GGATGCCTG7 CGACACAGGA GCAGGGTGGT CGACCCGGTC AGCAAGCAG3 C2ATGTGTTA 24660
TGAGAACGCG GGAAACATGA CTGCAGACTG GCGAC7GCA7 GTCAGAACCG TGTCAGT7A2 24720
TGCAGG7T7C CTGT7ACCCC 7GGCCCTCCT TAT7CTGTTT TATGCTCTCA CCTGGTGT37 24780
GG7GAGGAGG ACAAAGCTGC AAGCCAGGCG GAAGG7AAGG GGGGTGA77G TTGCTGTGG7 24840
GCTGCTGTTT TTTG7G7TT7 GCTTCCC7TA CCACGTACTA AATCTACTGG ACAC7CTGCT 24900
AAGGCGACGG TGGATCCGGG ACAGCTGCTA TACGCGGGGG TTGATAAACG TGGGTCTGGC 24960
AGTAACCTCG T7AC7GCAGG CACTG7ACAG CGCCGTGGTT CCCCTGATAT ACTCCTGCCT 25020
GGGA7CCC7C 7TTAGG GAGA GGATGTACGG TCTCT7CCAA AGCCTCAGGC AG7C77TCA7 
7CCGGCGCC ACCACGTAGC CCGCGGATGT CTACGTGCCC 7TCCCCCTTA AT77AATCTA 
GCCTCCCGTT CCCA7GATGC AGAGAGGCGA ATTTGGTTTG TACACAGATG TGACTATGTA 
TTTGTTTTAT TATGCGATTA AATGAGGGGT C7GA7CCCAA AAGCAATGTT TAGTGGTGGT 
CG77GA7C7T CT7GACGCTC CATAGGTAGA TTGACTGGAA CGCCATGGCC CACGGGGACA 
7GGACAGGGG TGTTAGGTCT GGTGGAACAT GCTGCCACTG CCACGGATGG AA CAT C AGAG 
ATGGGTCTAT GATCAGGGCA GCGTGTCGCC CGTCACTGGA TGTAAGTCCG GCCACCGTGG 
AGTTGCCTGT GG3G777C7G GGATAGTGTC TGGCTGGCAG GGTCTCATCC GCGGCATTTC 
CATGGTAGGT GAGGGTTATC TCGCCTCGCT GTCTCAGTAT GTACTCGAGG GCG7CCTGC7 
CGTACCGGAC CCCCAGGTAC TCTCCCTGGG CCCAGCTGGG CAGCACCGTC CCCCGCAACA 
CTCGGAGGAA AACG27C77A G7G77C7GAG GGATCTGTAT GTTTAGCCAG 7GGC7GTCA7 25680 
AC AG C77GG A CACGTTGGTC TCCAGGTTTA CCGCCCAGCG CTGGGGTGGT GTGGGTCCGT 
ACG7G7A7GG 7GAGGA77CC GACCGGCCCA C7ACACCCAG G3CCACCAGC AGCT3GAAGC 
CCACCTCGCC AC AGCAGATG GAGAATGTGT CGGG7CTGT7 7AGAAAC7C7 G7CAGGG7GG 
AGGCACAGG7 AGGG7CGT7A 2ACAGCGCCA GGACCCATCC CCTGGCGCTG GCG7AGC7GG 
CCTGGCAGCC TG77 CTGAGA CAT G 7 AAT C A GACCAGAGAA CCCCGACAAG GAC7G7CC7C 
G77TAAGCTC T7CCACA37C AC2G7GGCCA CC7CAAAGCC CG7GT7CTGC AACGCGGCCA 
TGAGCGCG7A CGGGGCAC7G C7CCCAGGCA GCACCAACGC GGCCACACGG CGCGGGGAGG 
TGGGGCACGA AAACAGGCGC AGC7GAC7CC CAAGGCACAT GGCCCTTAGG CTGCCCAGGT 
GATGCTCCAG ACGACCCAGG 7CC77CC7G7 GCA7G7CC7C CAG7GGG7GC AGGGGAGGCG 26220 
7 C AC C AGG 77 CCACA77TCG T C AG AAAAGG AGG7CCA7GA GAC77GCAAG GAAG7CAGGG 
TCTC77GAAA CACAACTGTC 7CG77C7GCA AAACCG7GAC G77G77GCC7 TG7C CC7CGG 
GGCCAACGG7 GCCCA37GG3 TGTGCCACGC AGCGGTAGTC CCTGGCCGCC CGCAGCACC7 
C7GACAAG7G 7 AC C7GGG G C ACC7CAACCA GTGC2CCAGG GG7C7C7GAA AC CAT AAG77 
CGAGCGGG77 AGGG7GGGCG GG7AGTGAGA GC7GCAG7CC CC7GCAGCCG GCCAGGGCCA 
7C7CGA7TGC AGA7GGGAGA AGCCC7CCG7 CCCC7ATGTC G7GCCCAGA7 ACAATGAGCC 
TC7TGGACA7 CAGGTAC7TA ACAAG CATGA ACAGGCTGGC GACCGTGGAC GGG77CAGAG 
GGGG7A77GG GTGCC7GGA7 GCCAGGAAG7 TGTG2TCGAA GGTGGACCCG GC7A7GAGAC 
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AGCTCTGATT CACGGCCAGG TATACCAGGG CGTTGCCTTC GACCTT7ACG TCCGGGGTGA 
CCCTGTATC7 GGATCCCTTG ACCTCGGCCC AG CTGGTAAA CACCACCGAG TTGAAGGGAA 
GGACCTCCAC CGTTTCTTGC TGTTGTGTGA TGCGCACATG GCGCTCCGAA AGCG7CGGAG 
AGCTGGCAGC CGAGGAGATG GACAGTGCCA CTCCCAGCTC CCGGCAGAAT TCCTTGCAGG 
CGAAGAGGCA CTCCTGTAGG AGGCCGGCTT GGTGGTCCTC TGGACTCCAC GCCACGGCGC 
CAGTTAGCAC TACGTCCTGG AG CTTGG AC A CGGGACTGAA CATGAGGTTC- GTGAGAGCCT 
CGGTGATGGC ATAGGTGGCC CCGGTGGATA CATTAGTAGC CATCTTGTAG GCCTGCTCCC 
CCATGGCCAT TGCCTGACCC CTCCACGCTG GCACTGGAAG CAGCTCCTGG GGCAGGGCC7 
TCACCCAGGT CTCGAAGTCC TTGTGTAGGA GGTTGGCCAT GGACGGAGTG ATGGCCTCCA 
CCGTGTCGGG CACTCTGGGC GCCACCCTCT CGGCCAGCAT GGACGAGTGC AGCACCAGGT 
GGTAGTC7GA AACCGGTATG TCCAGGGGTC CCACGCCAGC CTGTTGGGCG ATGAGGCCGT 
TGG AG CAT CG GTCCATGTGT CGCGTAAAGA ACTCCTTGCT GCCAACCGTC GAGTGGCGAA 
GTAACTGGTG GATTGTGGAG CCGGTGGCAA AAAGGCCCCA GTCAACATCC TCGGGGTGCC 
CCGAGACGCG GACACCATCG GACAGCGCCA GCCAGGGGGA CGGGGGGGTG GACGACGGC7 
GGTCTACAGA GAAGACCCTC GTGGTCTCCC CGGTCAGGTC GTCTACTATT CTGATGCCTG 
GGTGCTCCGA GGTCCTCCCG AG G AC CG TT A CCTGGCACGC GCACAGGCGC GCGGCGCGCT 
GCAGTACCTC CAACGGGGTC T CG C CC AG AT CCCCAGGCAC CCCGCCCGAC TCTGCCACCA 
CCGCAAACAC CAGGGAGCAA TACACGTTGA GAAAGTGCTC TGCCACCGCC GCCTTCACGG 
CATCCGGACC GGCCGCGGGA TCCGCAGGCA GGTGGGTGCG CACCTCGTCG GGTAGCTTGG 
AGACAAACAG CTCCAGGCCG GTCCGCGGCG CCAGCGCCTG CAGGTGCCTC ACCACCGGGG 
CCGGGTCATG CGATCTGTTT AGTCCGGAGA AGATAGGGCC CTTGGCAAG C CGCTGGACCA 
GCTTCAGGGT CTC C AAGATG CGCACCGCAT TGTCGGAGCT GTCGCGATAG AGGTTAGGGT 
AGGTGTCCGG TCCATCCGTG GGCTCAAACC TGCCCAGACA CACCACTGTC 7GCTGGGGGA 
TCATCCTTCT CAGGGAGATG CATTCTTTGG AAGTAGTGGT AGAGATGGAG CAGACTGCCA 
GGGCGTTGCC AGGAGTGGTG GCGATGGTGC GCACCGTTTT TAAGAAACCC CCCAGGGTGG 
GGACTCCCGC 7CCCTGCAGC ATCTCGGCCT GCTGTACGCC CTTGGCGAAT ATGCGACGGA 
ATCGGCTGTG CGCACGGGGT CCCAGGGCCG GTTCGGTGGC A7ACAGGCCG GTGAGGGCCC 
CCTGTGTCTG 7CCGCCTGGA AACAGGGTGC TG7GAAACAG CAGGTTGCCA AGGCCG CGAA 
TACCCC7CTG CACGCTGCTG TGGACGTGGG TGTACGCTCC GTGGATCCCG AACGCCTGTC 
TGGCACAGT7 CCAGGGCCAC CGTTCCATGG TGCATCTTCC CGG7ATCACA AAGTACCTGG 
CCACGTTATA ATTGTCCCCG GTTGAAGCCT GCACCGCCAG CGGTAGCAGG 7CTGCCCCCA 
GGGATATCAT AACAGCCTGC ATAATGACAT CATCT7CAAT GTGTGGCCTA GCCACGGGCT 
GGGGACCCTC GGGCACTTCC AACCCCTCGT ACGGTACCAG GTCGG7AT7T T2TGTAAATG 
CTGAGG7GGG TGTGGTTCTA GCAGGGTC7G TGTGATTT73 GACACCAGCT 
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GCCTGCCCAC TTCCACTCTA GCCCACTCCT GCAATCCTAG CT CTTG C AG C AGAACTGCAA 
GCTCTGTTGA CAATGTTGTG GGCCGGTGGT G CATGTTTGG CCCGTAGCCA AAGGATACAA 
CACGCTCGCT CCCCCGTGGC ACAGACCGCC TGATGACATG GGGATATCCA AGGAGCGGTG 
AC AG C AC AG C GAGCACCGTC TGTATTTCCA CATCCCGTCT CTCTCGCTCC TCCCTCGAAG 
TGGGAGGTCT TCGGAAAGTT AT C CAT AG C A GATAGTAGCC TCCGGTGCCA CCGGGTACGA 
GAGTGAG7GT GCCCGTACGG CTTG TAT AAA AGTT C AC AAA AGCTTCCTCA TCCGCGGTGA 
GATCACTCTC CAACCACAGC CCAGTGACGT CGTAGGCCAT GCCTAGAGGG CGCACCSCCC 
CCGGGGACAC CCTCTGTAGT CAGGCTGCCG AGAAACCCGC GAGATCTCTG G GG AG TAG G A 
AGAAACTTAG AATCCCCAAA TATGTCGCAG TCACAGGTTG TCGGGCAGAG TCTGTTTCCG 
CTTTCATGGG ATCCACAGTT ACTTGTAGCC ATGTCACTAA CCTCAAATAC TCAAAAAAAG 
CTATCGA7GG AAAAATG CTG TGGTCCTAGG TTAGTCCGTG GGAAACAAAA CTTCCTCATA 
CACTTCA7C7 GCAGGCTGAA ATGGTGGCGG ATCCAGACTC CTTACACCAC AGTTGCTCAC 
ATTAGAGATA CCTGATTGGT TAATACAAGC GGACGCACGC GTTGGTGGAG GCGTGTTGTC 
GCCCAAGA7A CTAGCATAGG TGACTGTGCG TTCGCTATGT AGTTGCTGCA TTTCAAGTTG 
GGTCGTTACT TCTGTGTTGC AAACCCTTAC TGGAGATAAT GCCATGTCTG TTGTGGAAC7 
TAAAATACGC GAG7G7ATAA CATTTCTAGA TGGTAGAGGT GGTAAACGGC GAGCTAAATG 
A7TAACATCG GGACATATCC TGCCTGCATG AGCATG7GGT GTGTCGTGTG G TG T AT AT AT 
TGGTAATCTT GTTGTTACAT TGTTGAACGA CACAAGTCTG CTCTCTCGG7 AGAGATAACC 
CACCAGTACG GCTTGGCCAG TACCTAATAA GAAAAAATAA AATCGTTAAT CTCTGTTTTT 
ATGTGGCGCT GGTGTTCCAA TTATAAATAA AAACACAACT CACTTAATAT CACAATTACA 
CAAATCAGTC CTGAAGTAAC ACCTGTAGTC CAACCGTCAG TGTAGAGCAG GACTAACTTA 
ACACAGCATC CAGCACATGT CCATGCTAAG GAAATAAACC AAAGTTATGT TTCGGTTTGC 
TTTATGACCA GGGAGCTGCT ACCCAGGTAC AAAAAATCC7 TACCCAAAAA TAG AAA C AG G 
AAGCCACCAG AGAGTGAAGC TTTGTGAAAG CTT7GCCAGC AGAAGAAACA ATATAATAAA 
AAGCCACAGC CTGCTAGTAA TGTTATACTC CC7G7AAATA AAAAATAT G G ACAGTAATAA 
TTTATGACAC CCAATAAGTA TGTGGAAAAA ATGTAATGTA AA C C A CT AT A CTGGTAAAAA 
CATACCT7CG TTATTGGTGT CTTGTTCGCG CTTTATAAAC AGTATCCCTA TTGTTGTGG7 30360 
TAGTGTAACC AACACTCCTC CTTGTAAAAG TAAAAATGAC ATAAGCCCCT TAGTTGA7CC 30420 
AATCCAA7GT CGTTTCATTG TTATAAACAA GCCGGTCATA CCTGTAATAA A G TT ATT CAT 
TACAAAA7GT TATAATAGTA TTGGTAATGT TTAGTTAAGA TAATGTAAAC TTCACAGTAG 
TCATATACCA AT AT GT AT G C AGCTTATGCA TCCTGCGATG ATTACAGAAA GGCA7GAATG 
GGAAACG CAA AAAAAGGCCG GTGTTGCCTT GAGTATACCT GTAGTAAAAA ATAAATAATA 
TTGTTGG7TG CAATGCTTAG GTGCAAGCAG ACATAATTGC ATAGCAGTAA AAA CC AG ACT 
TACCACCACA TAT7GCAAAC ACACATGCAG CG AG CTTG AG ACAAGGCCCA 7TATCTGT7G 
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CAAAGATATG TATAAAAAAA ACAAGCAACA ATGTCCATAA TGGCAAAAAA AACTGGCAA7 
GTGTCCAGTT GTTGTAAATC TGCAATCCCA TTGAGAATAT AAGTACCAAC A C C AT AAC AA 
TG C AC AGTAA TCCGCTATCA AT AG TG C ATT TAACGACTCT TAATGTTCCA CCAAGTGATA 
GAATGGCTGA AAAACACATA CAGGGGAATT ACGTTTTTTT AAAAAATTGG AAATATTAGA 
TACATAATTT TTATTTAATA AAAAACCTTT AGTAAAACTT ACCAGTAATT ATAGACAA7A 
AACTTATAAT ACAAACACAA A C AG TACT C A AAGTACTTTG AGTAGAGAAA C7CCAACTGG 
CAAAGGCAA7 ACATCCTAAA ACAAAAGACA AATACACGAG ACATTTAAAC AATGTATACT 
TAG AAAG AAA TAAGTTAAAC ATTTAAAAAA TGTAAGTTAC CAACAATTAT AGATGGTCCA 
ATGGGAGGGG AAG CTTGAAA ACGTTGTTTT TTTGACTGCA CATATATGTT GTTATTGTAC 
AAAAAAGTTG GTAGTAAACA CTTATGTTAC TGAGCAAAAA TATGGTGTTT TGTAAATTTA 
TAG TT AAAAG ACAAAACATA ATAGACAAAC ACCCACAACA TGTTATAAGT GCTGCAAACC 
AAGTACCCCA CAGGTATTTT TTGTAATTCA TTGTAGACAA AAAG C C C AAG GCCCAAAAA7 
GAAGTGGACA AAAGAAATAT GTAATTAAGT GTAGTTGGAC AAGGAATTAT ATAGC7GGAT 
GAGTTAGTTT TGCACAGAAC CAGACATCCT ATTTTTGTTT GGAAACCTAA AATCCGGATG 
AAGGGCTTAT AAAATGGCAC AGCTGCAAAA AG CTGATAAT GTAACACTGC ATCCTGGTG7 
TTTTGATTGT AGCGGAAAAA TGTAATAAAT TTT A C AG AC A GTTTTGCCTA CTGAGAACA7 
GTTGAAAAAA AGGCACTAAG GGCTTTTTT3 C C AAAG G AAA AATGCCCCCG TGG33TTAGG 
GGAAAGGGGG GATGGGGTGA TGGGGGAATG GTGGGAAAGG GGGGATGGGG TGAT33GGGA 
ATGGTGGGAA AGGGG7GATG GGGTGATGGG GGAATGGGGG GAAAGGGGGA ATGGGGGGAA 
AGGGGGAATG GGGGGAAAGG GGGAATGGGG GGAAAGGGGG GATGGGGGGA AAGGGGGAAT 
GGGGGGAAAG GGGGAATGGG GGGAAAGGGG GGATGGGGGG AAAGGGGGAA TG3GG3GAAA 
GGGGGGATGG GGGGAAACGG GGGATGGGGG GAAAGGGGGG ATGGGGGGGA AAGGGGGGA7 
GGGGGGGAAA GGGGGGATGG GGGGGAAAGG GGGGATGGGG GGGAAAGGGG GGATGGGGAA 
GGGGGGGGGG AGGGGGAAGG GGGTGAAGGG GGAAGGGGGG AGGCGAA 
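
The listing above is printed as groups of ten bases, six groups per line, with an occasional running position at the end of a line. As a minimal sketch (not part of the original disclosure), the following Python reads one such line and flags characters that are not A, C, G, or T, which in this scan are most likely recognition errors. The function names and the rule that a trailing all-digit token of four or more characters is the position are assumptions made for illustration.

```python
import re


def parse_listing_line(line):
    """Split one listing line into (bases, position or None)."""
    tokens = line.split()
    position = None
    # Assumption: a trailing all-digit token of 4+ characters is the running position.
    if tokens and re.fullmatch(r"\d{4,}", tokens[-1]):
        position = int(tokens.pop())
    bases = "".join(tokens)
    return bases, position


def suspect_characters(bases):
    """Return (index, character) pairs for anything that is not A, C, G, or T."""
    return [(i, ch) for i, ch in enumerate(bases) if ch not in "ACGT"]


# One line copied from the listing as printed, including its scan damage.
line = "GATGCTCCAG ACGACCCAGG 7CC77CC7G7 GCA7G7CC7C CAG7GGG7GC AGGGGAGGCG 26220"
bases, position = parse_listing_line(line)
suspects = suspect_characters(bases)
print(len(bases), position, len(suspects))
```

The sketch only reports suspect positions; it does not guess which base a damaged character originally was.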
What is claimed is: 
1. An isolated nucleic acid encoding a Kaposi's 
sarcoma-associated herpesvirus polypeptide 
selected from the group comprising: 

viral macrophage inflammatory pro: 
viral interleukin 6; 
viral interferon regulatory factor 
complement-binding protein; 
glycoprotein B; 

capsid protein IV encoded by ORF € 
immediate early protein encoded by 
glycoprotein M; and 
glycoprotein L. 
2. The synthetic DNA of claim 1. 



3. The genomic DNA of claim 1. 






4. The cDNA of claim 1. 



5. The RNA of claim 1. 



6. A replicable vector comprising the nucleic acid 
of claim 1. 



7. A host cell comprising the vector of claim 6. 



8. The eukaryotic cell of claim 7. 



9. The bacterial cell of claim 



10. A plasmid, cosmid, λ phage or YAC comprising the 
isolated nucleic acid of claim 



11. A nucleic acid of at least 14 nucleotides capable 
of specifically hybridizing with the isolated 
nucleic acid of claim 1. 



12. The nucleic acid of claim 11 which is labeled 
with a detectable marker. 

13. The nucleic acid of claim 12, wherein the marker 
is a radioactive, a colorimetric, a luminescent, 
or a fluorescent label. 

14. An isolated polypeptide having the amino acid 
sequence encoded by the nucleic acid of claim 1. 



15. The polypeptide of claim 14 linked to a second 
polypeptide to form a fusion protein. 

16. The fusion protein of claim 15, wherein the 
second polypeptide is beta-galactosidase. 

17. An antibody which specifically binds the 
polypeptide of claim 14. 

18. The antibody of claim 17, wherein the antibody is 
a polyclonal antibody. 

19. The antibody of claim 17, wherein the antibody is 
a monoclonal antibody. 

20. A host cell which expresses the polypeptide of 
claim 14. 

21. A vaccine which comprises an effective immunizing 
amount of the polypeptide of claim 14 and a 
pharmaceutically acceptable carrier. 

22. An antisense molecule capable of specifically 
hybridizing with the nucleic acid of claim 

23. The antisense molecule of claim 22, wherein the 
molecule is a nucleic acid derivative. 

24. A triplex oligonucleotide capable of specifically 
hybridizing with the double-stranded nucleic acid 
of claim 1. 

25. A transgenic nonhuman mammal which comprises the 
nucleic acid of claim 1 introduced into the 
mammal at an embryonic stage. 

26. A method of diagnosing a DNA virus associated 
with Kaposi's sarcoma in a subject which 
comprises: 

(a) obtaining a nucleic acid sample from the 
subject; 

(b) contacting the sample obtained in step (a) 
with the labeled nucleic acid of claim 12 
under high stringency hybridization 
conditions; 

(c) detecting the presence of any labeled 
nucleic acid hybridized in step (b), the 
presence of which is indicative of a DNA 
virus associated with Kaposi's sarcoma, 

so as to thereby diagnose a DNA virus associated 
with Kaposi's sarcoma in the subject. 

27. The method of claim 26, wherein the sample 
comprises a bodily fluid. 



28. The method of claim 27, wherein the bodily fluid 
comprises serum. 



29. The method of claim 26, wherein the sample 
comprises a tissue specimen. 

30. The method of claim 29, wherein the tissue 
specimen comprises a tumor lesion. 

31. The method of claim 26 wherein the nucleic acid 
is amplified before step (b). 

32. A method of diagnosing a DNA virus associated 
with Kaposi's sarcoma in a subject which 
comprises: 

(a) obtaining a sample from the subject; 

(b) contacting the sample from step (a) with a 
support having already bound thereto the 
Kaposi's sarcoma antibody of claim 17, so as 
to bind the antibody to any specific 
Kaposi's sarcoma antigen present in the 
sample; 

(c) removing any unbound material from the 
support of step (b); and 

(d) detecting the presence of any specific 
Kaposi's sarcoma antigen bound by the 
Kaposi's sarcoma antibody in step (c), the 
presence of which is indicative of the DNA 
virus associated with Kaposi's sarcoma, 

so as to thereby diagnose the DNA virus 
associated with Kaposi's sarcoma in the subject. 

33. The method of claim 32, wherein the sample 
comprises a suitable bodily fluid. 

34. The method of claim 33, wherein the bodily fluid 
comprises serum. 



35. A method of diagnosing a DNA virus associated 
with Kaposi's sarcoma in a subject which 
comprises: 

(a) obtaining a suitable bodily fluid sample 
from the subject; 

(b) contacting the sample from step (a) with a 
support having already bound thereto a 
Kaposi's sarcoma antigen encoded by the 
isolated nucleic acid of claim 1, so as to 
bind the antigen to any specific Kaposi's 
sarcoma antibody present in the sample; 

(c) removing any unbound material from the 
support of step (b); and 

(d) detecting the presence of any specific 
Kaposi's sarcoma antibody bound by the 
Kaposi's sarcoma antigen in step (c), the 
presence of which is indicative of the DNA 
virus associated with Kaposi's sarcoma, 

so as to thereby diagnose the DNA virus 
associated with Kaposi's sarcoma in the subject. 



36. The method of claim 35, wherein the sample 
comprises a suitable bodily fluid. 



37. The method of claim 36, wherein the bodily fluid 
comprises serum. 

38. A method of treating a subject infected with 
Kaposi's sarcoma-associated herpesvirus 
comprising administering to the subject an 
effective amount of an antisense molecule of 
claim 22 under conditions such that the antisense 
molecule selectively enters an infected cell of 
the subject, so as to thereby treat the subject. 



39. A method of treating a subject infected with 
Kaposi's sarcoma-associated herpesvirus 
comprising administering to the subject a 
pharmaceutically effective amount of an antiviral 
agent in a pharmaceutically acceptable carrier, 
wherein the agent specifically binds to the 
polypeptide of claim 14. 

40. A method of prophylaxis or treatment for a 
subject infected with Kaposi's sarcoma-associated 
herpesvirus comprising administering to the 
subject the antibody of claim 17 in a 
pharmaceutically acceptable carrier. 

41. A method of vaccinating a subject against 
Kaposi's sarcoma-associated herpesvirus 
comprising administering to the subject an 
effective amount of the polypeptide of claim 14 
and a pharmaceutically acceptable carrier, so as 
to thereby vaccinate the subject. 

42. A method of immunizing a subject against a 
herpesvirus associated with Kaposi's sarcoma 
which comprises administering to the subject an 
effective immunizing dose of the vaccine of claim 
21 and a pharmaceutically acceptable carrier. 



43. The antibody of claim 18, which antibody is 
specifically immunoreactive with peptides 
encoding an antigenic portion of viral 
interleukin-6. 

44. The antibody of claim 43, wherein the antigenic 
portion of viral interleukin 6 comprises the 
amino acid sequences as set :or... — - * w - ^ 

and SEQ ID NO : 3 . 

45. The method of claim 40, wherein the antibody is a 
chimeric antibody. 

46. The method of claim 40, wherein the antibody is a 
humanized antibody. 



47. A method of passively immunizing a subject 
against a herpesvirus associated with Kaposi's 
sarcoma which comprises administering to the 
subject an effective immunizing amount of the 
antibody of claim 43 and a pharmaceutically 
acceptable carrier. 
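
Claim 11 recites a nucleic acid of at least 14 nucleotides capable of specifically hybridizing with the claimed nucleic acid. As a minimal sketch (not part of the original disclosure), the Python below derives the reverse complement of a 14-base window, which is the usual starting point for a candidate probe. The names `reverse_complement` and `candidate_probe` and the constant `MIN_PROBE_LENGTH` are hypothetical, and the target string is simply the first twenty bases of one listing line as printed in this scan. Whether a probe hybridizes specifically depends on stringency conditions and genome context, which this sketch does not model.

```python
# Watson-Crick pairing table for DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Minimum length taken from the wording of claim 11.
MIN_PROBE_LENGTH = 14


def reverse_complement(seq):
    """Return the reverse complement of a DNA string made of A, C, G, T."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return seq.translate(COMPLEMENT)[::-1]


def candidate_probe(target, start, length=MIN_PROBE_LENGTH):
    """Return the strand complementary to target[start:start + length]."""
    if length < MIN_PROBE_LENGTH:
        raise ValueError("probe shorter than the 14-nucleotide minimum")
    window = target[start:start + length]
    if len(window) < length:
        raise ValueError("window runs past the end of the target")
    return reverse_complement(window)


# Illustrative target: twenty bases as printed in the listing (scan accuracy not verified).
target = "AAGGCGACGGTGGATCCGGG"
probe = candidate_probe(target, 0)
print(probe)
```

Applying `reverse_complement` to the returned probe gives back the original window, which is a quick sanity check on the pairing table.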
FIG. 2C, FIG. 2D (panel labels: Nde II, Taq I) 

FIG. 4A, FIG. 4C, FIG. 4D, FIG. 4E (probe: β-actin) 
2. Claims Nos.: 
because they relate to parts of the international application that do not comply with the prescribed requirements to such 
an extent that no meaningful international search can be carried out, specifically: 

3. Claims Nos.: 
because they are dependent claims and are not drafted in accordance with the second and third 
sentences of Rule 6.4(a). 

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet) 

This International Searching Authority found multiple inventions in this international application, as follows: 

Please See Extra Sheet. 

1. As all required additional search fees were timely paid by the applicant, this international search report covers all searchable 
claims. 

2. As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment 
of any additional fee. 

3. As only some of the required additional search fees were timely paid by the applicant, this international search report covers 
only those claims for which fees were paid, specifically claims Nos.: 

4. No required additional search fees were timely paid by the applicant. Consequently, this international search report is 
restricted to the invention first mentioned in the claims; it is covered by claims Nos.: 
1-10 

Remark on Protest 
The additional search fees were accompanied by the applicant's protest. 
No protest accompanied the payment of additional search fees. 

Form PCT/ISA/210 (continuation of first sheet(1)) (July 1992) 



INTERNATIONAL SEARCH REPORT 



International application No 

PCT/US97/13346 



BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING 
This ISA found multiple inventions as follows: 

This application contains the following inventions or groups of inventions which are not so linked as to form a single 
inventive concept under PCT Rule 13.1. 

izi !; ^ l ££:z\r££: . . ***** - — » f — , - 

G Zl V.".Z?.) 11-13. 22-23. "« 3*. » »°'" u " " d 

Group VI, claim 24, drawn to a triplex oligonucleotide. 
Group VII, claim 25, drawn to a transgenic animal. 

Group VIII, claim(s) 26-31, drawn to a method of using an antisense molecule. 
Group IX, claim(s) 32-34, drawn to a method of using an antibody. 
Group X, claim(s) 35-37, drawn to a method of using an antigen. 



The inventions listed as Groups I-X do not relate to a single inventive concept under PCT Rule 13.1 because, under PCT 
Rule 13.2, they lack the same or corresponding special technical features for the following reasons: DNA sequences, 
antisense molecules, and polypeptide sequences of KS-associated herpesvirus are known (Chang et al., Science, 16 
December 1994, Vol. 266, pages 1865-1869). Thus the nucleic acid molecule of Group I lacks unity with the 
inventions of Groups II-X. 

The products of Groups I-VII are chemically, structurally, biologically, or immunologically distinct from each other. 
Furthermore, there is more than one known method for using these products, such as immunoassays, blottings, 
hybridization probes, vaccines, therapeutics, gene therapy, and expression vectors. 

The methods of Groups VIII-X have different steps and utilize reagents which are chemically, structurally, biologically, 
or immunologically distinct from each other. 

Accordingly, the claims are not so linked by a special technical feature within the meaning of PCT Rule 13.2 so as to 
form a single inventive concept. 
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